
Author: Chia-Lin Wu (吳佳霖)
Thesis title: Implementation of Automated Imaging System and Applications for Deep Learning Based Product Recognition (自動化取像系統研製與深度學習之商品辨識應用)
Advisor: Ming-Jong Tsai (蔡明忠)
Oral defense committee: Chen-Hao Li (李振豪), Yong-Lin Kuo (郭永麟), Min-Fan Lee (李敏凡)
Degree: Master
Department: College of Engineering - Graduate Institute of Automation and Control
Year of publication: 2021
Academic year of graduation: 109
Language: Chinese
Number of pages: 109
Keywords: Deep Learning, Image Processing, Imaging System, Object Recognition, YOLO

In recent years, image recognition and deep learning have increasingly been applied in fields such as logistics and retail. As demand for services grows, these techniques are being brought to market to support store operations and make shopping more convenient for consumers, and labor-saving image recognition applications are a trend for future development. This research builds an automatic flip imaging mechanism and applies image processing and deep learning to product recognition. The work is divided into three stages. The first stage is the automatic flip imaging system: a mechanism that automatically rotates, flips, and photographs a product. Products with different appearance features, such as shape and color, were selected for imaging; photographing 19 products from 5 categories in different poses yields about 20,392 images. The second stage is image pre-processing: every image from the first stage is processed to automatically generate a label file containing the product's position in the image, namely the index of the product's label name and the position of the product expressed as a ratio relative to the image. Processing one image and generating its label file takes about 6.5 seconds. The third stage trains two versions of the YOLO algorithm and tests the product recognition results. Adding a small number of manually annotated complex images to the single image set obtained in the earlier stages is enough for the YOLOv4 model to reach a test-set mAP of 95.44%. The experimental results show that the automatic flip imaging mechanism designed in this thesis reduces the labor and time needed to collect training images, and that the trained model recognizes products with good accuracy.
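
The label file described in the abstract (a class index plus the product's position as a ratio of the image size) resembles the plain-text annotation format commonly used for YOLO training, where each line reads `class x_center y_center width height` with all four values normalized to the range 0 to 1. The following is a minimal sketch under that assumption, not the thesis's actual code: the function name `yolo_label_line`, the pixel bounding-box input, and the six-decimal formatting are all hypothetical choices made for illustration.

```python
from typing import Tuple


def yolo_label_line(
    class_index: int,
    box: Tuple[int, int, int, int],
    image_size: Tuple[int, int],
) -> str:
    """Build one YOLO-style label line from a pixel bounding box.

    box        -- (x_min, y_min, x_max, y_max) in pixels (hypothetical input form)
    image_size -- (width, height) of the image in pixels
    """
    x_min, y_min, x_max, y_max = box
    width, height = image_size

    # Reject boxes that are empty or fall outside the image.
    if not (0 <= x_min < x_max <= width and 0 <= y_min < y_max <= height):
        raise ValueError("bounding box must be non-empty and inside the image")

    # Express the box center and size as ratios of the image dimensions.
    x_center = (x_min + x_max) / 2 / width
    y_center = (y_min + y_max) / 2 / height
    box_width = (x_max - x_min) / width
    box_height = (y_max - y_min) / height

    return (
        f"{class_index} {x_center:.6f} {y_center:.6f} "
        f"{box_width:.6f} {box_height:.6f}"
    )


if __name__ == "__main__":
    # A product occupying the central quarter of a 640x480 image, class index 3.
    print(yolo_label_line(3, (160, 120, 480, 360), (640, 480)))
```

Storing ratios instead of pixel coordinates keeps a label valid when the image is resized for training, which is one plausible reason to record the position relative to the image.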

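Acknowledgments
Abstract (Chinese)
Abstract (English)
Table of Contents
List of Figures
List of Tables
Chapter 1 Introduction
  1.1 Preface
  1.2 Research Motivation and Objectives
  1.3 Research Methods
  1.4 Related Literature
  1.5 Thesis Organization
Chapter 2 Related Techniques
  2.1 Image Processing
    2.1.1 Color Spaces
    2.1.2 Morphology
    2.1.3 Canny Edge Detection
  2.2 Deep Learning
  2.3 Model Evaluation: Validation Metrics
    2.3.1 Precision and Recall
    2.3.2 Average Precision and Mean Average Precision
    2.3.3 IoU
    2.3.4 Non-Maximum Suppression
  2.4 YOLO
    2.4.1 YOLOv1
    2.4.2 YOLOv2
    2.4.3 YOLOv3
    2.4.4 YOLOv4
Chapter 3 Design, Construction, and Application of the Automatic Imaging System
  3.1 Experimental Method
  3.2 Architecture of the Automatic Imaging System
    3.2.1 Hardware of the Imaging Mechanism
    3.2.2 Software Control of the Imaging Mechanism
  3.3 Program Architecture
    3.3.1 Automatic Flip Imaging System
    3.3.2 Image Pre-processing
    3.3.3 Data Training
Chapter 4 Experimental Results and Discussion
  4.1 Functional Test of the Automatic Flip Imaging Mechanism
  4.2 Imaging Time Results of the Automatic Flip Imaging System
  4.3 Test Results of Image Pre-processing
  4.4 YOLO Model Training Results
  4.5 YOLOv4 Model Validation Results
    4.5.1 Comparison of Model and Optimization Results
    4.5.2 Multi-Product Image Recognition Results of the Optimized Model
Chapter 5 Conclusions and Future Work
  5.1 Conclusions
  5.2 Future Work
References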
