
Author: 游凱安 (Kai-An Yu)
Thesis title: Development of a Deep Learning Classification Approach for Indoor Personal Image Servo Manipulation Objects (針對室內個人化影像伺服操作物件之深度學習分類器開發)
Advisor: 郭重顯 (Chung-Hsien Kuo)
Oral examination committee: 宋開泰 (Kai-Tai Song), 林其禹 (Chyi-Yeu Lin), 蘇順豐 (Shun-Feng Su), 徐繼聖 (Gee-Sern Hsu)
Degree: Master
Department: College of Electrical Engineering and Computer Science - Department of Electrical Engineering
Year of publication: 2017
Academic year of graduation: 105 (ROC academic year)
Language: Chinese
Number of pages: 60
Keywords (Chinese): 深度學習, 基於區域之卷積神經網路, YOLO v2, 室內物件辨識, 電梯按鍵辨識
Keywords (English): deep learning, region-based CNN, YOLO v2, indoor object recognition, elevator button recognition

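This thesis proposes a deep learning classifier for objects manipulated through personalized indoor visual servoing. To improve the independent-living ability of people with spinal cord injuries, the design targets the two indoor scenarios that users need most: "indoor object recognition" and "elevator button recognition". To this end, the thesis builds on and modifies the YOLOv2 deep learning network. For indoor object recognition, it proposes a hybrid dataset that combines ImageNet, PASCAL VOC, and self-collected data. For elevator button recognition, a conventional single-layer network fails when the button content is too small to recognize, so the thesis proposes a novel "double-layer recognition network": after the first-layer network locates the buttons, the button regions are cropped from the image, processed with image enhancement, and fed to the second-layer network, which recognizes the button content. In addition, to compensate for detections that the deep learning network may miss in practice, the thesis proposes a knowledge-base button inference engine that raises the button recognition rate. In the experiments, indoor object recognition in real scenes reached a recognition rate of 66.08%. For elevator button recognition, the rate was 64.53% without the knowledge-base inference engine or image enhancement, rose to 66.31% with image enhancement, and reached 81.08% once the knowledge-base inference engine was added.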
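The double-layer recognition flow described in the abstract (locate buttons, crop each region, enhance it, then recognize its content) can be sketched as follows. This is a minimal illustration under stated assumptions, not the thesis's implementation: `detect_buttons` and `classify_content` are hypothetical callables standing in for the two trained networks, and the enhancement step (nearest-neighbour upscaling plus contrast stretching) is a stand-in, since the record does not say which enhancement technique was used.

```python
import numpy as np


def crop_box(image, box):
    """Crop an (x1, y1, x2, y2) box from the image, clipped to the image bounds."""
    x1, y1, x2, y2 = box
    h, w = image.shape[:2]
    x1, y1 = max(0, x1), max(0, y1)
    x2, y2 = min(w, x2), min(h, y2)
    return image[y1:y2, x1:x2]


def enhance(patch, scale=4):
    """Stand-in enhancement (assumption): upscale, then stretch contrast to 0..255."""
    # Nearest-neighbour upscaling along both spatial axes.
    up = np.repeat(np.repeat(patch, scale, axis=0), scale, axis=1).astype(np.float32)
    lo, hi = up.min(), up.max()
    if hi > lo:
        up = (up - lo) / (hi - lo) * 255.0
    return up.astype(np.uint8)


def recognize_buttons(image, detect_buttons, classify_content):
    """Two-stage flow: stage 1 finds button boxes, stage 2 reads each enhanced crop.

    detect_buttons(image) -> list of (x1, y1, x2, y2) boxes   (hypothetical stage 1)
    classify_content(patch) -> label for the button content   (hypothetical stage 2)
    """
    results = []
    for box in detect_buttons(image):
        patch = crop_box(image, box)
        if patch.size == 0:
            # Box lies entirely outside the image; nothing to classify.
            continue
        results.append((box, classify_content(enhance(patch))))
    return results
```

Passing the two stages in as callables keeps the sketch independent of any particular detector; in practice each would wrap a trained network.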


This thesis proposes a deep learning object recognition classifier for personal manipulation in indoor environments. To improve the independent-living ability of people with spinal cord injuries (SCIs), the work addresses two main scenarios: “indoor object recognition” and “elevator button recognition”. To this end, a custom modification of YOLO v2 (You Only Look Once), building on state-of-the-art region-based convolutional neural networks, is proposed. In the indoor object recognition scenario, a merged dataset was assembled from the ImageNet dataset, the PASCAL VOC (Visual Object Classes) dataset, and self-collected images. In the elevator button recognition scenario, to handle button content that is too small to be detected by a one-layer network, the thesis proposes a novel “double-layer recognition classifier” that detects the buttons in the first layer and then recognizes their content in the second layer. In addition to this structural modification, a knowledge-based reasoning (KBR) approach is applied to compensate for missed buttons and content. The results show that the proposed method achieves 66.08% accuracy for indoor object recognition in real scenes. For elevator button recognition, accuracy is 64.53% without image enhancement or the KBR button inference engine, 66.31% with image enhancement, and 81.08% once the KBR inference engine is added.
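As a rough illustration of how a knowledge-based rule might fill in button content that the network missed, the sketch below handles one column of in-car floor buttons. The rule itself is an assumption made for this example (floor numbers decrease by one per row from top to bottom); the record does not describe the actual rules of the thesis's inference engine, and `infer_column_labels` is a hypothetical name.

```python
from typing import List, Optional


def infer_column_labels(labels: List[Optional[int]]) -> List[Optional[int]]:
    """Fill unrecognized floor labels in one column of detected buttons.

    `labels` is ordered top to bottom; None marks a button that was detected
    but whose content was not recognized. Assumed rule (hypothetical): floor
    numbers decrease by exactly 1 per row going down the column.
    """
    known = [(i, v) for i, v in enumerate(labels) if v is not None]
    if not known:
        # No recognized button to anchor the inference on.
        return list(labels)
    ref_index, ref_value = known[0]  # first recognized button is the reference
    return [
        v if v is not None else ref_value + (ref_index - i)
        for i, v in enumerate(labels)
    ]
```

For example, `infer_column_labels([None, 5, None, 3])` returns `[6, 5, 4, 3]`. Recognized labels are left untouched; only the gaps are inferred from the layout assumption.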

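Deferred Public Release Application
Advisor's Recommendation Letter
Oral Examination Committee Approval
Acknowledgments
Abstract (Chinese)
Abstract (English)
Table of Contents
List of Tables
List of Figures
Glossary
Chapter 1 Introduction
  1.1 Research Background and Motivation
  1.2 Research Objectives
  1.3 Research Contributions
  1.4 Thesis Organization
  1.5 Literature Review
Chapter 2 Deep Learning Object Detection Algorithms
  2.1 Convolutional Neural Networks
  2.2 Region-Based Convolutional Neural Networks
  2.3 Real-Time Object Detection Algorithm (YOLO)
    2.3.1 Unified Object Detection Pipeline
    2.3.2 YOLO v1 Network Training
    2.3.3 Drawbacks and Limitations of YOLO v1
  2.4 Real-Time Object Detection Algorithm (YOLO v2)
  2.5 Deep Learning Training Datasets and Performance Analysis
    2.5.1 Evaluation Metrics for Deep Learning Classifiers
    2.5.2 Introduction to the Training Datasets
Chapter 3 Deep Learning Classifier for Indoor Personalized Visual Servo Manipulation
  3.1 Indoor Object Recognition Classifier Design
    3.1.1 Indoor Object Dataset Collection
    3.1.2 YOLO v2 Indoor Object Deep Learning Classifier
  3.2 Indoor Elevator Button Recognition Classifier Design
    3.2.1 Design Considerations for the Elevator Button Classifier
    3.2.2 Double-Layer Recognition Network Architecture
    3.2.3 Elevator Button Dataset Collection
    3.2.4 First-Layer Button Recognition Network Design
    3.2.5 Image Extraction and Image Enhancement Preprocessing
    3.2.6 Second-Layer Button Recognition Network Design
  3.3 Knowledge-Base Button Inference Engine Design
    3.3.1 Inference Engine Design Considerations
    3.3.2 Inference Engine for Buttons Outside the Elevator Car
    3.3.3 Inference Engine for Buttons Inside the Elevator Car
  3.4 Measuring Object Spatial Coordinates
Chapter 4 Experimental Results and Discussion
  4.1 Experimental Environment Setup
  4.2 Experimental Procedure Design
  4.3 Indoor Object Recognition Accuracy Experiments
    4.3.1 Validating the Classifier Model on the Dataset
    4.3.2 Real-Scene Recognition Experiments with Varying Background Complexity
  4.4 Elevator Button Recognition Accuracy Experiments
    4.4.1 Validating the Classifier Model on the Dataset
    4.4.2 Accuracy Analysis of the Double-Layer Elevator Button Recognition Network
Chapter 5 Conclusions and Future Work
References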


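Full-text release date: 2022/08/16 (campus network)
Full-text release: not authorized for public release (off-campus network)
Full-text release: not authorized for public release (National Central Library: National Digital Library of Theses and Dissertations in Taiwan)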