Graduate Student: 魏禎佑 Chen-Yu Wei
Thesis Title: 風格保留之指定字元車牌影像生成 (Style-Preserving Generation of Synthetic License Plate with Desired Characters)
Advisor: 徐繼聖 Gee-Sern Hsu
Committee Members: 郭景明 Jing-Ming Guo, 陳祝嵩 Chu-Song Chen, 莊永裕 Yung-Yu Chuang, 林嘉文 Chia-Wen Lin
Degree: Master
Department: College of Engineering, Department of Mechanical Engineering
Year of Publication: 2022
Academic Year: 110 (ROC calendar)
Language: Chinese
Pages: 48
Keywords (Chinese): 生成對抗網路, 車牌辨識, 場景文字辨識, 風格轉換
Keywords (English): Generative Adversarial Networks, License Plate Recognition, Scene Text Recognition, Style Transfer
Abstract:
We propose a Style-Preserving Generator (SPG) for license plate image synthesis. Given an arbitrary string that follows the license plate content regulations, the proposed SPG can generate images that show the given string while preserving the in-the-wild styles of license plates. The generated license plate images can be used to train a License Plate Recognition (LPR) system, enabling it to outperform the same system trained on real data, as the synthetic data effectively handles the content-imbalance issue of real data and also addresses its privacy concerns. The proposed SPG is composed of the Input Styling Network (ISN), the Foreground Extraction Network (FEN), the Background Restoration Network (BRN), and the Fusion Network (FN). The ISN transforms the desired input string into two foreground images compliant with the style of a source image. The FEN extracts the segmentation map of the foreground string in the style source image. The BRN restores the background of the style source image with the foreground string removed. The FN fuses the features of the transformed foreground segmentation map and the restored background into a target image that keeps the style of the source image but with its string replaced by the desired one. We demonstrate the performance of our approach on the AOLP and LP2022 benchmarks, and also apply it to synthesize Chinese scene text to demonstrate an extended application.
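The dataflow among the four sub-networks described above can be sketched as plain functions. This is a minimal illustration only: the real ISN, FEN, BRN, and FN are learned networks, and the dictionary-based "images" and all function bodies here are hypothetical stand-ins used to show how the outputs are composed.

```python
# Hedged sketch of the SPG pipeline: ISN -> FEN -> BRN -> FN.
# Every body below is a placeholder for a trained network; only the
# wiring between stages follows the description in the abstract.

def isn(target_string, style_image):
    """Input Styling Network: render the desired string in the source style.
    Returns two foreground renderings (e.g., a plain skeleton and a styled one)."""
    skeleton = {"text": target_string, "style": None}
    styled = {"text": target_string, "style": style_image["style"]}
    return skeleton, styled

def fen(style_image):
    """Foreground Extraction Network: segmentation map of the source string."""
    return {"mask_of": style_image["text"]}

def brn(style_image, foreground_mask):
    """Background Restoration Network: background with the string removed."""
    return {"background": style_image["style"], "removed": foreground_mask["mask_of"]}

def fn(styled_foreground, restored_background):
    """Fusion Network: composite the styled string onto the restored background."""
    return {"text": styled_foreground["text"],
            "style": restored_background["background"]}

def spg(target_string, style_image):
    """Full pipeline: the output keeps the source style, swaps the string."""
    _, styled = isn(target_string, style_image)
    mask = fen(style_image)
    background = brn(style_image, mask)
    return fn(styled, background)

source = {"text": "ABC-1234", "style": "white plate, slight motion blur"}
result = spg("XYZ-5678", source)
# result carries the new string with the source image's style preserved
```

A trained implementation would replace each function with a CNN and each dictionary with an image tensor, but the composition order is the same.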