
Graduate Student: Chao-Hsu Yang (楊朝旭)
Thesis Title: Synthesis of Photorealistic License Plates (擬似逼真車牌的合成技術)
Advisors: Sheng-Luen Chung (鍾聖倫), Gee-Sern Hsu (徐繼聖)
Committee Members: Sheng-Luen Chung (鍾聖倫), Gee-Sern Hsu (徐繼聖), Chung-Hsien Kuo (郭重顯), Shun-Feng Su (蘇順豐), Wen-Hsien Fang (方文賢)
Degree: Master
Department: College of Electrical Engineering and Computer Science - Department of Electrical Engineering
Year of Publication: 2020
Graduation Academic Year: 108 (2019-2020)
Language: English
Number of Pages: 72
Keywords: License plate detection and recognition, image-to-image translation, cycleGAN, AdaIN, image generation, styleGAN
    A license plate serves as a vehicle's identity card, and license plate recognition is essential in scenarios such as access control, law enforcement, and everyday driving. Developing license plate detection and recognition (LPDR) techniques requires a large, representative training set of license plates; however, acquiring license plate images is not only difficult but also raises privacy concerns. Applying deep-learning-based style learning and transfer, this thesis aims to synthesize realistic license plate images. We investigate image-to-image translation using cycleGAN and AdaIN, as well as image generation using styleGAN. For synthesizing license plates, the main advantage of the former is that, given only a license plate template, it can learn from a set of real license plate images (or even a single one) and transfer the real plates' style onto that template. The latter, once trained on many real license plate images to internalize their style, can randomly generate realistic synthetic plates in that style. By comparison, the image generation approach synthesizes more photorealistic plates. Accordingly, this thesis proposes a lightweight styleGAN which, together with a representative StyleGANLP4075 license plate dataset, generates photorealistic license plate images. In addition, we propose an automatic curation mechanism that ensures the quality of the synthetic plates. On top of this technique, we built a demonstration system that challenges the human eye to judge which plate images are real and which are synthetic. Finally, regarding the application of synthetic license plate images to LPDR, we tested on the challenging LP-2020 license plate database and found that, compared with 61.25% recognition accuracy when training only on real images, using the above synthetic plates as a pre-training set raises accuracy to 79.72%.
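    For reference, the AdaIN operation mentioned above, introduced by Huang and Belongie, aligns the channel-wise statistics of a content feature map x to those of a style feature map y. The formula below is the standard published definition, not notation taken from this thesis:

    \mathrm{AdaIN}(x, y) = \sigma(y)\,\frac{x - \mu(x)}{\sigma(x)} + \mu(y)

    where \mu(\cdot) and \sigma(\cdot) are the mean and standard deviation computed per channel across spatial positions. StyleGAN reuses this operation internally: a learned affine map of the style vector supplies \sigma(y) and \mu(y) at every generator layer.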


    A license plate serves as a vehicle's identity card. Automatic license plate detection and recognition (LPDR) is essential for access control, law enforcement, and driving in general. Developing a successful LPDR system requires a large, representative training set of license plates. However, the acquisition of license plate images is expensive and is further restricted by privacy concerns. Exploiting advances in deep-learning-based style transfer and image generation, this study aims to synthesize realistic license plate images. Two synthesis techniques are investigated: image-to-image translation by cycleGAN and AdaIN, and image generation by styleGAN. In the context of license plate generation, the former translates a pre-defined, simple license plate template image into a new one rendered in a given style; the latter generates multiple images fitting a given style from random latent codes, without relying on a template. By comparison, the image generation approach produces more photorealistic images. Accordingly, this study proposes a lightweight styleGAN architecture together with a representative training set, StyleGANLP4075, which jointly generate license plate images of photorealistic quality. In addition, a curation mechanism is proposed to ensure the quality of the synthetic images. On top of these techniques, an exhibition system is implemented to challenge the human eye as to which plates are real and which are synthesized. Finally, regarding the application of synthetic license plate images to LPDR, we used the challenging LP-2020 license plate database as the test target and found that LPR recognition accuracy is 61.25% when only the real-image training set is used, compared with 79.72% when the synthetic license plates are additionally used for pre-training.
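    As a concrete illustration of the generation pipeline summarized above, the following is a minimal PyTorch sketch of the styleGAN idea: a mapping network turns a random latent code z into an intermediate style vector w, and w modulates the generator's feature maps through AdaIN at each block. All module names, dimensions, and the toy two-block depth are illustrative assumptions, not the thesis's actual lightweight architecture.

# Minimal styleGAN-style generator sketch (illustrative only; not the
# thesis's architecture). Requires: pip install torch
import torch
import torch.nn as nn

class AdaIN(nn.Module):
    """Adaptive instance normalization: re-style normalized features with
    a per-channel scale and bias predicted from the style vector w."""
    def __init__(self, channels, w_dim):
        super().__init__()
        self.norm = nn.InstanceNorm2d(channels)
        self.affine = nn.Linear(w_dim, channels * 2)  # predicts (scale, bias)

    def forward(self, x, w):
        scale, bias = self.affine(w).chunk(2, dim=1)
        scale = scale[:, :, None, None]  # broadcast over H x W
        bias = bias[:, :, None, None]
        return (1 + scale) * self.norm(x) + bias

class TinyStyleGenerator(nn.Module):
    """Toy generator: learned constant input -> AdaIN-modulated conv blocks."""
    def __init__(self, z_dim=64, w_dim=64, channels=32):
        super().__init__()
        self.mapping = nn.Sequential(            # mapping network: z -> w
            nn.Linear(z_dim, w_dim), nn.LeakyReLU(0.2),
            nn.Linear(w_dim, w_dim), nn.LeakyReLU(0.2),
        )
        # Learned constant with a wide, plate-like aspect ratio (4 x 16).
        self.const = nn.Parameter(torch.randn(1, channels, 4, 16))
        self.conv1 = nn.Conv2d(channels, channels, 3, padding=1)
        self.adain1 = AdaIN(channels, w_dim)
        self.to_rgb = nn.Conv2d(channels, 3, 3, padding=1)

    def forward(self, z):
        w = self.mapping(z)                           # intermediate style code
        x = self.const.expand(z.size(0), -1, -1, -1)  # same start for every sample
        x = torch.relu(self.adain1(self.conv1(x), w))
        return torch.tanh(self.to_rgb(x))             # RGB in [-1, 1]

g = TinyStyleGenerator()
plates = g(torch.randn(8, 64))   # 8 random latent codes -> 8 synthetic images
print(plates.shape)              # torch.Size([8, 3, 4, 16])

    Sampling a batch of latent codes then yields a batch of plate-shaped synthetic images; in the thesis's setting, a curation step (an alphanumeric detector) would further filter such outputs for quality.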

    Abstract (Chinese)
    Abstract
    Acknowledgements
    Contents
    List of Figures
    List of Tables
    Chapter 1: Introduction
      1.1 Motivation
      1.2 Contribution
      1.3 Thesis Organization
    Chapter 2: Related Work
      2.1 Style Transfer and Image Generation
        2.1.1 Neural Style Transfer
        2.1.2 Unpaired Image-to-Image Translation
        2.1.3 Generation of High Resolution Images
      2.2 Data Generation and Augmentation
      2.3 License Plate Detection and Recognition
    Chapter 3: Image-to-Image Translation
      3.1 Regarding Image Style
      3.2 Synthesis of License Plate Images
        3.2.1 The Need for Synthetic License Plate Images
        3.2.2 License Plate Templates
        3.2.3 Description Attributes of a License Plate
      3.3 Image-to-Image Translation Approaches
        3.3.1 Overview
        3.3.2 CycleGAN
        3.3.3 Neural Style Transfer with AdaIN
    Chapter 4: License Plate Image Generation
      4.1 Introduction
      4.2 Literature Survey
        4.2.1 PG-GAN
        4.2.2 Adaptive Instance Normalization (AdaIN)
        4.2.3 StyleGAN
      4.3 Lightweight StyleGAN for License Plate Generation
        4.3.1 Proposed Lightweight Structure
        4.3.2 Representative License Plate Images
        4.3.3 Curation by Alphanumerical Detector
      4.4 Experiment Setups and Results
        4.4.1 Setup and Training Process
        4.4.2 License Plate Generation System
    Chapter 5: Experiments and Results
      5.1 LP-2020 Database
        5.1.1 Existing Databases
        5.1.2 Database Protocol
        5.1.3 Comparison to Other License Plate Databases by LPD
      5.2 A Two-staged Solution for LPDR
        5.2.1 LPDR Problem
        5.2.2 LPD Algorithms by YOLO V3 and Mask R-CNN
        5.2.3 LPR Algorithms
        5.2.4 LPR Results with Pre-trained Models by Synthetic Images
        5.2.5 LPDR Results with Pre-trained Models
    Chapter 6: Conclusion
    References
    Appendix A: Taiwan License Plates
    Appendix B: Glossary

