
Author: Hsiao-Chi Wu (吳曉琪)
Thesis title: An Automatic GUI Generating Method Based on Object Detection and Neural Machine Translation: A Case of Webpage Layout
Advisor: Chin-Shyurng Fahn (范欽雄)
Oral examination committee: Zen-Chung Shih (施仁忠), Jiann-Der Lee (李建德), Yuan-Shin Hwang (黃元欣)
Degree: Master
Department: College of Electrical Engineering and Computer Science - Department of Computer Science and Information Engineering
Year of publication: 2020
Graduation academic year: 108 (2019-2020)
Language: English
Pages: 75
Keywords: Graphical User Interface, GUI Skeleton, Webpage Layout, Automatic Code Generation, Deep Neural Networks, HTML, Object Detection, Neural Machine Translation


    With the spread of personal computers and mobile devices and the broad coverage of the Internet, applications and websites of all kinds are appearing rapidly; to improve efficiency and capture the market, product development schedules are continuously compressed. The graphical user interface (GUI) is an important medium through which machines interact with humans. If GUI development can be accelerated within the release schedule, engineers can spend more time on logic features that demand deeper thought. We therefore propose a system that generates GUI layout prototypes from GUI mockups. The GUI skeleton generated by our system serves as guidance for GUI implementation. This representation is flexible and extensible, and can be compiled into different programming languages according to the requirements of the target platform; in this thesis, we compile it into the HTML of webpages for demonstration.
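The compilation of a platform-agnostic GUI skeleton into HTML described above can be sketched as a recursive template expansion. The token names and templates below are illustrative assumptions, not the thesis's actual skeleton vocabulary or compiler:

```python
# Hypothetical sketch of skeleton-to-HTML compilation: a skeleton is a tree
# of (token, children) pairs, and each token maps to an HTML template.
# Token names and Bootstrap-style classes here are illustrative only.
TOKEN_TO_HTML = {
    "row":  '<div class="row">{children}</div>',
    "col":  '<div class="col">{children}</div>',
    "btn":  '<button class="btn btn-primary">Button</button>',
    "text": '<p>Lorem ipsum</p>',
}

def compile_skeleton(node):
    """Recursively render a (token, children) tree into an HTML string."""
    token, children = node
    template = TOKEN_TO_HTML[token]
    inner = "".join(compile_skeleton(c) for c in children)
    return template.format(children=inner) if "{children}" in template else template

# A row containing two columns: one holds a button, the other a text block.
skeleton = ("row", [("col", [("btn", [])]), ("col", [("text", [])])])
html = compile_skeleton(skeleton)
```

Because only the token-to-template mapping is platform-specific, swapping in a different mapping would retarget the same skeleton to another language, which is the flexibility the abstract claims.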
    We propose two system architectures: Mockup Component-to-Code (MockupC2Code) and Mockup Component Attribute-to-Code (MockupCA2Code), both based on object detection and neural machine translation with deep neural networks. First, an object detection method extracts visual information, such as the type and relative position of each GUI component, from the GUI mockup. This visual information is then fed into a layout generation model based on neural machine translation to produce a GUI skeleton. To handle mockups containing multiple GUI component styles, MockupCA2Code learns and expresses styles in a more flexible way: it adds an attribute identification model between the object detection model and the layout generation model to describe the style of each GUI component.
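The hand-off between the detection stage and the translation stage above amounts to serializing detected components into an ordered token sequence that a sequence-to-sequence model can consume. The following sketch is illustrative only; the detection format and token scheme are assumptions, not the thesis's implementation:

```python
# Illustrative sketch (not the thesis's code): turn object-detection output
# into a position-ordered token sequence for a seq2seq layout model.
def serialize_detections(detections):
    """Sort detected components top-to-bottom then left-to-right and emit
    one token per component, '<class>@<coarse_row>'.
    Each detection is (class_name, x, y, w, h) in normalized coordinates."""
    # Group by coarse vertical position first (rounded y), then by x,
    # so components on the same visual row appear in reading order.
    ordered = sorted(detections, key=lambda d: (round(d[2], 1), d[1]))
    return [f"{name}@{round(y, 1)}" for name, x, y, w, h in ordered]

dets = [("button", 0.7, 0.21, 0.2, 0.1),
        ("label",  0.1, 0.19, 0.3, 0.1),
        ("image",  0.1, 0.55, 0.8, 0.3)]
tokens = serialize_detections(dets)
# tokens: ['label@0.2', 'button@0.2', 'image@0.6']
```

The key design point is that spatial layout is encoded into the sequence order itself, so the downstream translation model can learn layout structure from token order rather than raw pixels.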
    For training and experiments, in addition to the dataset proposed by Pix2Code, we also generate a dataset with more GUI component styles and more complex layouts. The BLEU scores of MockupC2Code are 91.50% and 62.99% on the Pix2Code dataset and our dataset, respectively, while those of MockupCA2Code are 91.80% and 58.91%. Going from an input GUI mockup to output HTML takes about 2 to 3 seconds, which can effectively speed up the development of GUI programs.
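The BLEU metric used for these scores compares generated token sequences against references via modified n-gram precision with a brevity penalty. A minimal sketch of the idea follows, simplified to n-grams up to 2 and a single reference; real evaluations typically use 4-grams and smoothing:

```python
# Minimal BLEU sketch: modified n-gram precision plus brevity penalty.
# Simplified for illustration; not the exact evaluation code of the thesis.
import math
from collections import Counter

def ngrams(seq, n):
    """Count the n-grams of a token sequence."""
    return Counter(tuple(seq[i:i + n]) for i in range(len(seq) - n + 1))

def bleu(candidate, reference, max_n=2):
    precisions = []
    for n in range(1, max_n + 1):
        cand, ref = ngrams(candidate, n), ngrams(reference, n)
        # Clip each candidate n-gram count by its count in the reference.
        overlap = sum(min(c, ref[g]) for g, c in cand.items())
        total = max(sum(cand.values()), 1)
        precisions.append(max(overlap, 1e-9) / total)
    # Brevity penalty discourages overly short candidates.
    bp = 1.0 if len(candidate) >= len(reference) else \
        math.exp(1 - len(reference) / len(candidate))
    return bp * math.exp(sum(math.log(p) for p in precisions) / max_n)

ref = "row col btn col text end".split()
score = bleu(list(ref), ref)  # identical sequences score 1.0
```

A perfect match scores 1.0 (i.e., 100%), so the 91.50% and 91.80% figures above indicate near-exact reproduction of the reference skeletons on the Pix2Code dataset.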

    Chinese Abstract i
    Abstract ii
    Acknowledgments iv
    Contents v
    List of Figures vii
    List of Tables x
    Chapter 1 Introduction 1
      1.1 Overview 1
      1.2 Motivation 3
      1.3 System Descriptions 4
      1.4 Thesis Organization 7
    Chapter 2 Related Work about Code Generation 8
      2.1 Encoder-Decoder Framework 8
      2.2 Neural Machine Translation 11
      2.3 Object Detection Method 12
    Chapter 3 GUI Components Detection 16
      3.1 Dataset Preprocessing 16
      3.2 Object Detection 18
        3.2.1 Bounding Box Prediction 19
        3.2.2 Network Architecture 21
        3.2.3 Loss Function 24
      3.3 Attribute Identification 26
        3.3.1 Feature Extraction by CNN Model 27
        3.3.2 Attribute Identification by RNN Model 28
    Chapter 4 Automatic GUI Generation 31
      4.1 Our GUI Skeleton Format 31
      4.2 Layout Generation 32
        4.2.1 Encoding by Stack Bidirectional LSTM 34
        4.2.2 Decoding by Stack LSTM 35
      4.3 GUI Program Compilation 37
        4.3.1 GUI Skeleton Token Tree Translation 37
        4.3.2 Webpage Design Framework-Bootstrap 39
        4.3.3 Printed Layout Rendering 41
    Chapter 5 Experimental Results and Discussions 43
      5.1 Experimental Setup 43
        5.1.1 The Dataset: Pix2Code Dataset 45
        5.1.2 The Dataset: Our Generated Dataset 46
      5.2 Results of Object Detection 49
        5.2.1 Results of MockupC2Code 50
        5.2.2 Results of MockupCA2Code 52
      5.3 Results of Attributes Identification 57
      5.4 Results of Layout Generation 59
      5.5 Results of Our System 64
    Chapter 6 Conclusions and Future Work 69
      6.1 Contributions and Conclusions 69
      6.2 Future Work 71
    References 73

    [1] J. M. Rivero et al., “Mockup-driven development: Providing agile support for model-driven web engineering,” Information and Software Technology, vol. 56, no. 6, pp. 670-687, 2014.
    [2] S. Ye, “Wireframe vs Mockup vs Prototype, What's the Difference? (2020 Updated),” 2017. [Online]. Available: https://www.mockplus.com/blog/post/wireframe-mockup-prototype-selection-of-prototyping-tools. [Accessed April 19, 2020].
    [3] Sketch team, “Sketch — The digital design toolkit,” [Online]. Available: https://www.sketch.com/. [Accessed May 26, 2020].
    [4] Adobe, “UI/UX design and collaboration tool | Adobe XD,” [Online]. Available: https://www.adobe.com/products/xd.html. [Accessed May 26, 2020].
    [5] Microsoft, “Sketch2Code,” 2018. [Online]. Available: https://www.microsoft.com/en-us/ai/ai-lab-sketch2code. [Accessed March 30, 2020].
    [6] O. Popelka and J. Štastný, “Automatic generation of programs,” in Advances in Computer Science and Engineering, M. Schmidt Ed. Rijeka, Croatia: IntechOpen, ch. 2, pp. 17-36, 2011.
    [7] X. Pang et al., “A novel syntax-aware automatic graphics code generation with attention-based deep neural network,” Journal of Network and Computer Applications, vol. 161, 2020.
    [8] T. Beltramelli, “Pix2Code: Generating code from a graphical user interface screenshot,” in Proceedings of the ACM SIGCHI Symposium on Engineering Interactive Computing Systems, Paris, France, pp. 1-6, 2018.
    [9] Y. Liu, Q. Hu, and K. Shu, “Improving pix2code based Bi-directional LSTM,” in Proceedings of the IEEE International Conference on Automation, Electronics and Electrical Engineering, Shenyang, China, pp. 220-223, 2018.
    [10] E. Wallner, “Turning Design Mockups Into Code With Deep Learning,” 2018. [Online]. Available: https://blog.floydhub.com/turning-design-mockups-into-code-with-deep-learning/. [Accessed March 31, 2020].
    [11] Z. Zhu, Z. Xue, and Z. Yuan, “Automatic graphics program generation using attention-based hierarchical decoder,” in Proceedings of the Asian Conference on Computer Vision, Perth, Australia, pp. 181-196, 2018.
    [12] Y. Shen et al., “Ordered neurons: Integrating tree structures into recurrent neural networks,” arXiv preprint arXiv:1810.09536, 2018.
    [13] I. Sutskever, O. Vinyals, and Q. V. Le, “Sequence to sequence learning with neural networks,” in Proceedings of the Advances in Neural Information Processing Systems, Montréal, Canada, pp. 3104-3112, 2014.
    [14] C. Chen et al., “From UI design image to GUI skeleton: A neural machine translator to bootstrap mobile GUI implementation,” in Proceedings of the 40th International Conference on Software Engineering, Gothenburg, Sweden, pp. 665-676, 2018.
    [15] T. A. Nguyen and C. Csallner, “Reverse engineering mobile application user interfaces with REMAUI (T),” in Proceedings of the 30th IEEE/ACM International Conference on Automated Software Engineering, Lincoln, NE, pp. 248-259, 2015.
    [16] B. Aşıroğlu et al., “Automatic HTML code generation from mock-up images using machine learning techniques,” in Proceedings of the Scientific Meeting on Electrical-Electronics & Biomedical Engineering and Computer Science, Istanbul, Turkey, pp. 1-4, 2019.
    [17] C. C. Chang, “An automatic GUI generating method from hand-drawn sketch to neat tableau based on deep neural networks: A case of webpage layout,” M. S. thesis, Department of Computer Science and Information Engineering, National Taiwan University of Science and Technology, Taipei, Taiwan, 2019.
    [18] J. Redmon and A. Farhadi, “YOLOv3: An incremental improvement,” arXiv preprint arXiv:1804.02767, 2018.
    [19] S. Ren et al., “Faster R-CNN: Towards real-time object detection with region proposal networks,” in Proceedings of the Advances in Neural Information Processing Systems, Montréal, Canada, pp. 91-99, 2015.
    [20] T.-Y. Lin et al., “Feature pyramid networks for object detection,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, pp. 2117-2125, 2017.
    [21] P. Kinghorn, L. Zhang, and L. Shao, “A region-based image caption generator with refined descriptions,” Neurocomputing, vol. 272, pp. 416-424, 2018.
    [22] K. Simonyan and A. Zisserman, “Very deep convolutional networks for large-scale image recognition,” arXiv preprint arXiv:1409.1556, 2014.
    [23] S. Hochreiter and J. Schmidhuber, “Long short-term memory,” Neural Computation, vol. 9, no. 8, pp. 1735-1780, 1997.
    [24] Z. Cui et al., “Deep bidirectional and unidirectional LSTM recurrent neural network for network-wide traffic speed prediction,” arXiv preprint arXiv:1801.02143, 2018.
    [25] Z. Huang, W. Xu, and K. Yu, “Bidirectional LSTM-CRF models for sequence tagging,” arXiv preprint arXiv:1508.01991, 2015.
    [26] Bootstrap team, “Bootstrap · The most popular HTML, CSS, and JS library in the world,” 2011. [Online]. Available: https://getbootstrap.com/. [Accessed May 24, 2020].
    [27] D. M. Powers, “Evaluation: from precision, recall and F-measure to ROC, informedness, markedness and correlation,” Journal of Machine Learning Technologies, vol. 2, no. 1, pp. 37-63, 2011.
    [28] M. Everingham et al., “The PASCAL Visual Object Classes (VOC) challenge,” International Journal of Computer Vision, vol. 88, no. 2, pp. 303-338, 2010.
    [29] D. Bahdanau, K. Cho, and Y. Bengio, “Neural machine translation by jointly learning to align and translate,” arXiv preprint arXiv:1409.0473, 2014.
    [30] M.-T. Luong, H. Pham, and C. D. Manning, “Effective approaches to attention-based neural machine translation,” arXiv preprint arXiv:1508.04025, 2015.
    [31] K. Papineni et al., “BLEU: A method for automatic evaluation of machine translation,” in Proceedings of the 40th Annual Meeting on Association for Computational Linguistics, Philadelphia, Pennsylvania, pp. 311-318, 2002.
    [32] A. Vaswani et al., “Attention is all you need,” in Proceedings of the Advances in Neural Information Processing Systems, Long Beach, California, pp. 5998-6008, 2017.

    Full-text release date: 2025/07/23 (campus network)
    Full-text release date: 2030/07/23 (off-campus network)
    Full-text release date: 2030/07/23 (National Central Library: Taiwan NDLTD system)