
Graduate Student: Feng-Chi Chuang (莊豐吉)
Thesis Title: Component-Font GAN: Generating Chinese Fonts Based on Chinese Components (Component-Font GAN: 基於文字部件的中文字體生成模型)
Advisor: Tien-Ruey Hsiang (項天瑞)
Committee Members: Chuan-Kai Yang (楊傳凱), Wei-Chung Teng (鄧惟中)
Degree: Master
Department: Department of Computer Science and Information Engineering, College of Electrical Engineering and Computer Science
Year of Publication: 2023
Academic Year of Graduation: 111 (2022–2023)
Language: Chinese
Pages: 32
Keywords: font generation, deep learning
Abstract

    Generating Chinese fonts is a difficult and complex task. Traditional font design consumes enormous manpower and time, because more than 20,000 Chinese characters must each be designed by hand. In recent years deep learning has been widely adopted across many fields, and Chinese font generation has likewise turned to deep learning to overcome the difficulties of the traditional approach. Recent work centers on generative adversarial networks (GANs): a large number of character images is fed to a neural network for training, font features such as stroke weight and style are extracted from the training data, and those features are transformed to render the target font. However, current Chinese font generation still faces three problems. First, the training set is large, typically containing thousands of characters; this already cuts manual design cost sharply compared with the traditional workflow, but room for improvement remains. Second, images produced by a GAN are error-prone, yielding implausible strokes such as skewed or bent lines. Third, the intricate strokes of Chinese characters burden the GAN's image translation, so fine details of the output tend to blur.

    In this study we propose the Component-Font GAN architecture for generating the desired Chinese fonts. Converting character-component images and structural information into feature vectors effectively mitigates the problem of implausible strokes, and replacing the ordinary convolutions in the generator with CoordConv raises the quality of fine details in the output. These architectural optimizations reduce the number of characters required in the training set by 30%. To validate Component-Font GAN, we build training data from the 21,886 Chinese characters in GBK in two different fonts and compare against zi2zi and Dense-UFont. Regardless of the number of training characters or the font style, Component-Font GAN produces higher-quality images than the competing models and achieves a significantly better Fréchet Inception Distance (FID) score.
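    The CoordConv layer cited above comes from Liu et al. [1]: before the convolution is applied, two extra channels holding the normalized x and y pixel coordinates are concatenated to the input, giving the filters explicit positional information. A minimal PyTorch sketch of such a layer follows (our own illustration, not the thesis implementation; the channel sizes in the usage lines are placeholders):

    import torch
    import torch.nn as nn

    class CoordConv2d(nn.Module):
        # Conv2d with two extra input channels carrying normalized x/y
        # coordinate maps (Liu et al., 2018).
        def __init__(self, in_channels, out_channels, **conv_kwargs):
            super().__init__()
            self.conv = nn.Conv2d(in_channels + 2, out_channels, **conv_kwargs)

        def forward(self, x):
            n, _, h, w = x.shape
            # Coordinate maps normalized to [-1, 1], one per spatial axis.
            ys = torch.linspace(-1.0, 1.0, h, device=x.device).view(1, 1, h, 1).expand(n, 1, h, w)
            xs = torch.linspace(-1.0, 1.0, w, device=x.device).view(1, 1, 1, w).expand(n, 1, h, w)
            return self.conv(torch.cat([x, ys, xs], dim=1))

    # Drop-in replacement for an ordinary 3x3 convolution in a generator block.
    layer = CoordConv2d(64, 128, kernel_size=3, padding=1)
    out = layer(torch.randn(4, 64, 32, 32))   # shape: (4, 128, 32, 32)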

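    The abstract does not spell out how the component and structure features are produced or fused. Purely as a hypothetical illustration (the 64x64 grayscale component image, the small CNN encoder, the nn.Embedding over coarse structure classes, and the concatenation scheme are all our assumptions, not the thesis design), conditioning a generator on a component image and a structure-class label might look like:

    import torch
    import torch.nn as nn

    class ConditionSketch(nn.Module):
        # Hypothetical fusion of a component image and a structure label
        # into one conditioning vector; NOT the thesis architecture.
        def __init__(self, n_structures=12, dim=128):
            super().__init__()
            # Tiny CNN encoder for a 64x64 grayscale component image (assumed size).
            self.comp_enc = nn.Sequential(
                nn.Conv2d(1, 32, 4, stride=2, padding=1), nn.ReLU(),
                nn.Conv2d(32, 64, 4, stride=2, padding=1), nn.ReLU(),
                nn.AdaptiveAvgPool2d(1), nn.Flatten(),
                nn.Linear(64, dim),
            )
            # Lookup table over coarse character structures
            # (e.g., left-right, top-bottom, enclosure, ...).
            self.struct_emb = nn.Embedding(n_structures, dim)

        def forward(self, comp_img, struct_id):
            return torch.cat([self.comp_enc(comp_img),
                              self.struct_emb(struct_id)], dim=1)

    cond = ConditionSketch()(torch.randn(4, 1, 64, 64), torch.tensor([0, 3, 3, 7]))
    # cond: a (4, 256) conditioning vector fed to the generator with the style features.

    The resulting vector would then condition the generator alongside the style features; the thesis's actual modules (Sections 3.2-3.4 below) may differ.
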
    Table of Contents

    Abstract (Chinese)
    Abstract (English)
    Acknowledgments
    Table of Contents
    List of Figures
    List of Tables
    1 Introduction
      1.1 Motivation and Objectives
      1.2 Thesis Organization
    2 Related Work on Font Generation
      2.1 Font Generation Based on Computer Vision
      2.2 Style Transfer Based on Generative Adversarial Networks
      2.3 Font Generation Based on Generative Adversarial Networks
    3 Overall Architecture and Implementation Details
      3.1 The Component-Font GAN Architecture
      3.2 Style Feature Module
        3.2.1 CoordConv
      3.3 Component Feature Module
      3.4 Structure Feature Module
      3.5 Generator Implementation Details
      3.6 Loss Functions
    4 Experiments and Evaluation
      4.1 Dataset
      4.2 Evaluation Methods
        4.2.1 Fréchet Inception Distance Score
      4.3 Experimental Validation
        4.3.1 Training-Set Size Experiment
        4.3.2 Generalization Experiment
        4.3.3 Qualitative Evaluation
        4.3.4 Ablation Study
        4.3.5 Data Augmentation Experiment
        4.3.6 Objective-Function Weight Experiment
      4.4 Summary of Experiments
    5 Conclusion
    References
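    The metric of Section 4.2.1, the Fréchet Inception Distance [2], measures the distance between Gaussian fits to the Inception features of real and generated images: FID = ||mu_r - mu_g||^2 + Tr(Sigma_r + Sigma_g - 2(Sigma_r Sigma_g)^{1/2}). A minimal NumPy/SciPy sketch, assuming the Inception activations have already been extracted (the feature-extraction step is omitted):

    import numpy as np
    from scipy import linalg

    def frechet_inception_distance(feats_real, feats_gen):
        # FID per Heusel et al. (2017); rows of each array are per-image
        # Inception features (2048-dim pool features in the standard setup).
        mu_r, mu_g = feats_real.mean(axis=0), feats_gen.mean(axis=0)
        sigma_r = np.cov(feats_real, rowvar=False)
        sigma_g = np.cov(feats_gen, rowvar=False)
        covmean = linalg.sqrtm(sigma_r @ sigma_g)   # matrix square root
        if np.iscomplexobj(covmean):                # drop numerical imaginary part
            covmean = covmean.real
        diff = mu_r - mu_g
        return float(diff @ diff + np.trace(sigma_r + sigma_g - 2.0 * covmean))

    # Toy usage with random 64-dim features; lower FID = closer distributions.
    rng = np.random.default_rng(0)
    print(frechet_inception_distance(rng.normal(size=(500, 64)),
                                     rng.normal(size=(500, 64))))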

    References
    [1] R. Liu, J. Lehman, P. Molino, F. P. Such, E. Frank, A. Sergeev, and J. Yosinski, "An intriguing failing of convolutional neural networks and the CoordConv solution," in 32nd Conference on Neural Information Processing Systems (NeurIPS), 2018.
    [2] M. Heusel, H. Ramsauer, T. Unterthiner, B. Nessler, and S. Hochreiter, "GANs trained by a two time-scale update rule converge to a local Nash equilibrium," in Advances in Neural Information Processing Systems, pp. 6626–6637, 2017.
    [3] Y. Tian, "zi2zi: Master Chinese calligraphy with conditional adversarial networks," 2017.
    [4] T.-W. Weng, "Dense-UFont: Generating Chinese fonts from a small font sample set," 2022.
    [5] Y. Jiang, Z. Lian, Y. Tang, and J. Xiao, "SCFont: Structure-guided Chinese font generation via deep stacked networks," in The Thirty-Third AAAI Conference on Artificial Intelligence (AAAI-19), 2019.
    [6] S. Xu, F. C. Lau, W. K. Cheung, and Y. Pan, "Automatic generation of artistic Chinese calligraphy," IEEE Intelligent Systems, vol. 20, no. 3, pp. 32–39, 2005.
    [7] S. Xu, H. Jiang, T. Jin, and F. C. Lau, "Automatic generation of personal Chinese handwriting by capturing the characteristics of personal handwriting," in Innovative Applications of Artificial Intelligence, 2009.
    [8] B. Zhou, W. Wang, and Z. Chen, "Easy generation of personal Chinese handwritten fonts," in IEEE International Conference on Multimedia and Expo, pp. 1–6, 2011.
    [9] J.-W. Lin, C.-Y. Hong, R.-I. Chang, Y.-C. Wang, S.-Y. Lin, and J.-M. Ho, "Complete font generation of Chinese characters in personal handwriting style," in International Performance Computing and Communications Conference (IPCCC), pp. 1–5, 2015.
    [10] I. Goodfellow, J. Pouget-Abadie, M. Mirza, B. Xu, D. Warde-Farley, S. Ozair, A. Courville, and Y. Bengio, "Generative adversarial nets," Advances in Neural Information Processing Systems, vol. 27, 2014.
    [11] P. Isola, J.-Y. Zhu, T. Zhou, and A. A. Efros, "Image-to-image translation with conditional adversarial networks," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 1125–1134, 2017.
    [12] J.-Y. Zhu, T. Park, P. Isola, and A. A. Efros, "Unpaired image-to-image translation using cycle-consistent adversarial networks," arXiv preprint, 2017.
    [13] Y. Tian, "Rewrite: Neural style transfer for Chinese fonts," 2016.
    [14] Y. Jiang, Z. Lian, Y. Tang, and J. Xiao, "DCFont: An end-to-end deep Chinese font generation system," in SIGGRAPH Asia 2017 Technical Briefs, pp. 1–4, 2017.
    [15] N.T.N.U. M.T. Center, "ACCESS (Advanced Chinese Character Electronic Search System)."
