
Student: Chen-Yu Wei (魏禎佑)
Thesis Title: Style-Preserving Generation of Synthetic License Plate with Desired Characters (風格保留之指定字元車牌影像生成)
Advisor: Gee-Sern Hsu (徐繼聖)
Committee Members: Jing-Ming Guo (郭景明), Chu-Song Chen (陳祝嵩), Yung-Yu Chuang (莊永裕), Chia-Wen Lin (林嘉文)
Degree: Master
Department: College of Engineering - Department of Mechanical Engineering
Publication Year: 2022
Graduation Academic Year: 110
Language: Chinese
Pages: 48
Keywords: Generative Adversarial Networks, License Plate Recognition, Scene Text Recognition, Style Transfer



We propose a Style Preserving Generator (SPG) for license plate image synthesis. Given an arbitrary string that follows the license plate content regulation, the proposed SPG can generate images that show the given string and preserve the in-the-wild styles of license plates. The generated license plate images can be used to train a License Plate Recognition (LPR) system, enabling it to outperform the same system trained on real data, since the synthetic data effectively handles the content imbalance of real data and also avoids its privacy issues. The proposed SPG is composed of the Input Styling Network (ISN), the Foreground Extraction Network (FEN), the Background Restoration Network (BRN), and the Fusion Network (FN). The ISN transforms the desired input string into two foreground images whose style is consistent with a style source image. The FEN extracts the segmentation map of the foreground string in the style source image. The BRN restores the background of the style source image with the foreground string removed. The FN fuses the features of the transformed foreground segmentation map and the restored background into a target image that keeps the style of the source image but with the string replaced by the desired one. We demonstrate the performance of our approach on the AOLP and LP2022 benchmarks, and also apply the proposed approach to synthesize Chinese scene texts to demonstrate an extended application.
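The four-stage pipeline described in the abstract can be sketched as a minimal pure-Python flow. This is an illustrative sketch only: the function names, the dict-based stand-ins for images, and the returned fields are all assumptions made for demonstration, not the thesis implementation (which uses trained neural networks for each stage).

```python
# Hypothetical sketch of the SPG data flow (ISN -> FEN -> BRN -> FN).
# Images are mocked as dicts with "string" and "style" fields; in the
# actual method each stage is a learned network operating on pixels.

def input_styling_network(target_string, style_image):
    # ISN: renders the desired string into two foreground images
    # (styled text and its skeleton) matching the source plate's style.
    fg_text = {"string": target_string, "style": style_image["style"]}
    fg_skeleton = {"string": target_string, "style": "skeleton"}
    return fg_text, fg_skeleton

def foreground_extraction_network(style_image):
    # FEN: segmentation map of the original foreground characters.
    return {"mask_of": style_image["string"]}

def background_restoration_network(style_image, fg_mask):
    # BRN: inpaints the plate background with the characters removed.
    return {"style": style_image["style"], "string": None}

def fusion_network(fg_text, fg_skeleton, background):
    # FN: composites the styled foreground onto the restored background,
    # keeping the source style but carrying the desired string.
    return {"string": fg_text["string"], "style": background["style"]}

def spg(target_string, style_image):
    fg_text, fg_skel = input_styling_network(target_string, style_image)
    mask = foreground_extraction_network(style_image)
    bg = background_restoration_network(style_image, mask)
    return fusion_network(fg_text, fg_skel, bg)

plate = spg("ABC-1234", {"string": "XYZ-5678", "style": "white-on-black"})
print(plate)  # → {'string': 'ABC-1234', 'style': 'white-on-black'}
```

The key design point the sketch mirrors is the decomposition: foreground synthesis, foreground extraction, and background restoration are handled by separate networks, and only the fusion stage combines their outputs into the final image.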

Abstract (Chinese)
Abstract
Acknowledgments
Table of Contents
List of Figures
List of Tables
Chapter 1  Introduction
  1.1  Motivation
  1.2  Contributions
  1.3  Thesis Organization
  1.4  Method Overview
Chapter 2  Literature Review
  2.1  SRNet
  2.2  SwapText
  2.3  STEFANN
  2.4  TextStyleBrush
  2.5  SAGAN
  2.6  SF-GAN
  2.7  License Plate Recognition
Chapter 3  Main Method
  3.1  Input Styling Network
  3.2  Foreground Extraction Network
  3.3  Background Restoration Network
  3.4  Fusion Network
  3.5  Discriminator
  3.6  Training Procedure
Chapter 4  Experimental Results and Analysis
  4.1  Datasets
    4.1.1  Synthetic Dataset
    4.1.2  LP2022 License Plate Dataset
    4.1.3  AOLP License Plate Dataset
  4.2  Experimental Settings
  4.3  Evaluation Metrics
  4.4  Ablation Studies
    4.4.1  Foreground Extraction Network
    4.4.2  Self-Attention
    4.4.3  Spatial Transformer Network
    4.4.4  LP2022 Fine-Tuning
    4.4.5  Recognition Loss
  4.5  Analysis on the Datasets
  4.6  LP2022 Synthetic Dataset
  4.7  Traditional Chinese Scene Text
    4.7.1  Chinese Synthetic Dataset
    4.7.2  Chinese Synthetic Dataset
Chapter 5  Conclusion and Future Work
Chapter 6  References


Full text release date: 2024/09/28 (campus network)
Full text release date: 2024/09/28 (off-campus network)
Full text release date: 2024/09/28 (National Central Library: Taiwan NDLTD system)