| Field | Content |
|---|---|
| Graduate Student | 張舒涵 (Shu-Han Chang) |
| Thesis Title | 以深度學習技術進行自動化拼圖辨識 (Automated Puzzle Recognition Based on Deep Learning Techniques) |
| Advisor | 林清安 (Alan C. Lin) |
| Committee Members | 張復瑜, 小林博仁 |
| Degree | Master |
| Department | College of Engineering, Department of Mechanical Engineering |
| Year of Publication | 2023 |
| Graduation Academic Year | 111 (2022-2023) |
| Language | Chinese |
| Number of Pages | 93 |
| Keywords (Chinese) | 拼圖辨識, 深度學習, 機器視覺, YOLO演算法, 數據擴增 |
| Keywords (English) | Puzzle recognition, Deep learning, Machine vision, YOLO algorithm, Data augmentation |
Assembling a complete jigsaw puzzle requires a significant investment of time and effort. In light of the rapid advances in machine vision and deep learning, this thesis explores their application to automated puzzle recognition, thereby extending the innovative uses of these technologies.
This thesis introduces a deep learning-based YOLO algorithm designed to overcome the difficulties that traditional image processing methods face in puzzle recognition. The algorithm's predicted bounding boxes are used to determine each puzzle piece's type, position, and rotation. The key research items covered in this thesis are:
(1) Automated generation of the training datasets required for deep learning by means of image rotation, which substantially reduces the time spent on data collection and annotation while providing a large and diverse dataset that improves model training and performance (see the first sketch after this list).
(2) Development of a puzzle recognition system that determines the type, position, and rotation of randomly placed puzzle pieces, using the arctangent function to compute each piece's rotation and thereby automating the recognition process (see the second sketch after this list).
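The rotation-based data generation in (1) can be illustrated with a minimal sketch, assuming each puzzle class has a single template image with some background margin around the piece; the file layout, the `rotate_and_label` helper, and the fixed label geometry are illustrative assumptions rather than the author's actual implementation.

```python
# Minimal sketch of rotation-based dataset generation (hypothetical names
# and file layout; not the thesis's actual code).
import os
import cv2  # OpenCV

def rotate_and_label(template_path, class_id, angles, out_dir="dataset"):
    """Rotate one puzzle-piece template through integer angles (degrees) and
    write an image plus a YOLO-format label file for each rotation."""
    os.makedirs(out_dir, exist_ok=True)
    img = cv2.imread(template_path)
    h, w = img.shape[:2]
    center = (w / 2, h / 2)
    for angle in angles:
        # Rotate about the image centre; the canvas size is unchanged, so the
        # template needs enough margin that the piece is not clipped.
        M = cv2.getRotationMatrix2D(center, float(angle), 1.0)
        rotated = cv2.warpAffine(img, M, (w, h))
        stem = f"class{class_id}_rot{angle:03d}"
        cv2.imwrite(os.path.join(out_dir, stem + ".jpg"), rotated)
        # YOLO label: "class x_center y_center width height" (normalized);
        # here the piece is assumed to stay roughly centred in the frame.
        with open(os.path.join(out_dir, stem + ".txt"), "w") as f:
            f.write(f"{class_id} 0.5 0.5 0.9 0.9\n")

# Example: one labelled sample per degree for puzzle class 0.
# rotate_and_label("templates/piece_0.jpg", 0, range(0, 360))
```

Generating labels this way removes manual annotation because the class and approximate box are known by construction; in practice the box would be recomputed for each rotation, for example from the piece's contour.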
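The arctangent step in (2) can likewise be sketched, assuming each piece's position is taken as the centre of its predicted bounding box and its rotation is measured from the direction of a detected reference feature; the two-point construction and the `piece_pose` helper are assumptions for illustration, since the abstract does not spell out the exact formulation.

```python
import math

def piece_pose(class_id, box, feature_point):
    """Derive a piece's type, centre position, and rotation from one
    YOLO-style prediction. `box` is (x1, y1, x2, y2) in pixels;
    `feature_point` is a detected landmark on the piece whose direction
    from the centre encodes the piece's orientation (an assumption)."""
    x1, y1, x2, y2 = box
    cx, cy = (x1 + x2) / 2.0, (y1 + y2) / 2.0   # piece position = box centre
    dx, dy = feature_point[0] - cx, feature_point[1] - cy
    # atan2 distinguishes all four quadrants, unlike a plain arctangent of dy/dx.
    angle = math.degrees(math.atan2(dy, dx)) % 360.0
    return {"type": class_id, "position": (cx, cy), "rotation": angle}

# Example: class 3, box (100, 60, 180, 140), landmark at (170, 130)
# -> centre (140.0, 100.0), rotation 45.0 degrees
# print(piece_pose(3, (100, 60, 180, 140), (170, 130)))
```

Note that in image coordinates the y-axis points downward, so the sign convention of the reported angle must match the one used when the rotated training data were generated.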
In addition to proposing the fundamental methods for automated puzzle recognition, this thesis presents a practical test case of nine non-overlapping puzzle pieces placed at random on a workbench. The developed system is used to automatically identify the type, position, and rotation of each piece and to reassemble the scattered pieces into a complete puzzle. The results demonstrate a high success rate and strong robustness in recognizing non-overlapping pieces, although small orientation errors can arise from variations in the predicted bounding boxes and in the predicted class probabilities.