
Graduate student: 張舒涵 (Shu-Han Chang)
Thesis title: 以深度學習技術進行自動化拼圖辨識 (Automated Puzzle Recognition Based on Deep Learning Techniques)
Advisor: 林清安 (Alan C. Lin)
Oral defense committee: 張復瑜, 小林博仁
Degree: Master
Department: College of Engineering - Department of Mechanical Engineering
Year of publication: 2023
Academic year of graduation: 111
Language: Chinese
Number of pages: 93
Keywords (Chinese): 拼圖辨識, 深度學習, 機器視覺, YOLO演算法, 數據擴增
Keywords (English): Puzzle recognition, Deep learning, Machine vision, YOLO algorithm, Data augmentation

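Assembling a complete jigsaw puzzle takes a great deal of time and effort. With machine vision and deep learning developing rapidly, this thesis attempts to apply both technologies to automated puzzle recognition and to advance innovative applications of them.
This thesis proposes a deep learning approach based on the YOLO algorithm, which derives the class, position, and rotation of each puzzle piece from the predicted boxes the algorithm outputs, in order to overcome the problems that traditional image-processing techniques face in puzzle recognition. The main research items are:
(1) Automatically generating the dataset needed for deep learning training by means of rotation, which reduces the time spent on data annotation and collection while providing varied, large-volume data, thereby improving the training and performance of the deep learning model.
(2) Building a puzzle recognition system that uses the arctangent function to compute the class, position, and rotation of randomly placed puzzle pieces, so that puzzle recognition becomes automated.
Besides proposing a basic research method for automated puzzle recognition, this thesis uses nine puzzle pieces placed arbitrarily on a workbench, without overlap, as a test case. The test checks whether the developed system can automatically recognize the class, position, and rotation of each piece and reassemble the scattered pieces into a complete puzzle. The results verify that the system achieves a high success rate and robustness when recognizing non-overlapping pieces. Its drawback is that the predicted boxes and the predicted class probabilities introduce slight errors in the estimated position and orientation.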


Assembling a complete jigsaw puzzle requires a significant investment of time and effort. In light of rapid advances in machine vision and deep learning, this thesis explores the application of both technologies to automated puzzle recognition.
This thesis presents a method based on YOLO, a deep learning object detection algorithm, designed to overcome the challenges that traditional image processing methods encounter in puzzle recognition. The method uses the predicted bounding boxes to determine each puzzle piece's type, position, and rotation. The key research areas covered in this thesis are:
(1) Automated generation of training datasets by rotation, which reduces the time needed for data annotation and collection. This approach provides large, diverse datasets that improve the training and performance of deep learning models.
(2) Development of a puzzle recognition system that employs the arctangent function to compute the type, position, and rotation of randomly placed puzzle pieces, so that puzzle recognition is automated.
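Item (1) can be illustrated with a minimal sketch. It assumes labels are stored in the common YOLO text layout (class, normalised centre x, centre y, width, height) and that each source image is rotated about its centre without resizing. The function names `rotate_yolo_box` and `augment_labels` and the 30-degree step are hypothetical; the abstract does not describe the thesis's actual procedure, and the enclosing box computed here is looser than a box fitted to the rotated piece outline.

```python
import math


def rotate_yolo_box(box, angle_deg, img_w, img_h):
    """Rotate a normalised (cx, cy, w, h) box about the image centre.

    Returns the axis-aligned box enclosing the four rotated corners,
    clipped to the image. Because the image y axis points down, a
    positive angle appears clockwise on screen.
    """
    cx, cy, w, h = box
    px, py = cx * img_w, cy * img_h          # centre in pixels
    hw, hh = w * img_w / 2, h * img_h / 2    # half extents in pixels
    corners = [(px - hw, py - hh), (px + hw, py - hh),
               (px + hw, py + hh), (px - hw, py + hh)]

    ox, oy = img_w / 2, img_h / 2            # rotation origin: image centre
    t = math.radians(angle_deg)
    c, s = math.cos(t), math.sin(t)
    rotated = [(ox + c * (x - ox) - s * (y - oy),
                oy + s * (x - ox) + c * (y - oy)) for x, y in corners]

    xs = [p[0] for p in rotated]
    ys = [p[1] for p in rotated]
    x0, x1 = max(0.0, min(xs)), min(float(img_w), max(xs))
    y0, y1 = max(0.0, min(ys)), min(float(img_h), max(ys))
    return ((x0 + x1) / 2 / img_w, (y0 + y1) / 2 / img_h,
            (x1 - x0) / img_w, (y1 - y0) / img_h)


def augment_labels(label, step_deg, img_w, img_h):
    """Produce one (angle, label) pair per rotation step over a full turn."""
    cls, *box = label
    return [(angle, (cls, *rotate_yolo_box(box, angle, img_w, img_h)))
            for angle in range(0, 360, step_deg)]


if __name__ == "__main__":
    # One hand-made label becomes twelve labels, one per 30-degree step.
    for angle, lab in augment_labels((3, 0.75, 0.5, 0.1, 0.1), 30, 640, 640):
        print(angle, [round(v, 3) for v in lab])
```

Under these assumptions a single annotated image yields a label for every rotated copy, which is the kind of saving in annotation time that item (1) describes. The rotated images themselves would be produced separately with an image library.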
In addition to proposing a basic method for automated puzzle recognition, this thesis presents a practical example with nine non-overlapping puzzle pieces placed randomly on a workbench. The developed system is tested on whether it can automatically identify the type, position, and rotation of each piece and reassemble the scattered pieces into a complete puzzle. The results demonstrate a high success rate and robustness in recognizing non-overlapping puzzle pieces. The drawback is that the predicted bounding boxes and the predicted class probabilities introduce slight errors in the estimated position and orientation.
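The recognition and reassembly steps described above can be sketched as follows. The abstract does not say which points feed the arctangent, so this sketch assumes, purely for illustration, that each piece yields two predicted box centres: one for the whole piece and one for a reference feature on it. The `Detection` fields, the row-major class numbering of a 3 x 3 puzzle, and the `origin` and `cell_size` parameters are all hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass
class Detection:
    cls: int                # piece class 0..8, row-major in the solved puzzle (assumed)
    centre: tuple           # (x, y) centre of the box around the whole piece, in pixels
    feature_centre: tuple   # (x, y) centre of a box around a reference feature (hypothetical)


def rotation_deg(det, reference_deg=0.0):
    """Angle of the centre-to-feature vector relative to its solved-puzzle angle."""
    dx = det.feature_centre[0] - det.centre[0]
    dy = det.feature_centre[1] - det.centre[1]
    # atan2 covers all four quadrants, unlike atan(dy / dx).
    angle = math.degrees(math.atan2(dy, dx))
    return (angle - reference_deg) % 360.0


def assembly_move(det, origin, cell_size, cols=3, reference_deg=0.0):
    """Translation and rotation that bring a detected piece to its grid cell."""
    row, col = divmod(det.cls, cols)
    target = (origin[0] + (col + 0.5) * cell_size,
              origin[1] + (row + 0.5) * cell_size)
    return {
        "translate": (target[0] - det.centre[0], target[1] - det.centre[1]),
        "rotate_deg": (-rotation_deg(det, reference_deg)) % 360.0,
    }


if __name__ == "__main__":
    piece = Detection(cls=4, centre=(100.0, 100.0), feature_centre=(100.0, 120.0))
    print(rotation_deg(piece))
    print(assembly_move(piece, origin=(0.0, 0.0), cell_size=50.0))
```

In this sketch both arctangent inputs are box centres, so a shift of a few pixels in either predicted box changes the computed angle. That is consistent with the error source named in the abstract, although the thesis may compute the rotation differently.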

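Table of Contents
Abstract (Chinese)
Abstract
Acknowledgements
Table of Contents
List of Figures
List of Tables
Chapter 1 Introduction
  1.1 Research Motivation and Objectives
  1.2 Research Methods
  1.3 Literature Review
  1.4 Thesis Organization
Chapter 2 Puzzle Data
  2.1 Selection of Puzzle Types
  2.2 Puzzle Image Recognition Information
    2.2.1 Introduction to Puzzle Images
    2.2.2 Ground-Truth Boxes and Predicted Boxes in the Algorithm
    2.2.3 Finding the Puzzle Position
    2.2.4 Puzzle Rotation
  2.3 Puzzle Dataset
    2.3.1 Single-Piece Puzzle Images
    2.3.2 Multi-Piece Puzzle Images
    2.3.3 Puzzle Image Annotation
    2.3.4 Data Augmentation
Chapter 3 Puzzle Recognition
  3.1 Puzzle Recognition Model
  3.2 Puzzle Detection
    3.2.1 Anchor Points in Puzzle Detection
    3.2.2 Vector Configuration of Anchor Boxes
    3.2.3 Training the YOLO Algorithm
  3.3 Training the Puzzle Recognition Model
    3.3.1 Creating Data Files
    3.3.2 Splitting the Dataset
    3.3.3 Defining the Network Structure
  3.4 Training Results
    3.4.1 Evaluation of Each Model
    3.4.2 Overall Discussion
Chapter 4 Development Tools and Case Verification
  4.1 Software Development Tools
  4.2 Case Verification
    4.2.1 Determining the Assembly Positions of Puzzle Pieces
    4.2.2 Method for Assembling Puzzle Pieces
    4.2.3 Case Verification 1
    4.2.4 Case Verification 2
    4.2.5 Case Verification Results
Chapter 5 Conclusions and Future Research Directions
  5.1 Conclusions
  5.2 Future Research Directions
References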

