Graduate student: | 游凱安 Kai-An Yu |
---|---|
Thesis title: | 針對室內個人化影像伺服操作物件之深度學習分類器開發 Development of a Deep Learning Classification Approach for Indoor Personal Image Servo Manipulation Objects |
Advisor: | 郭重顯 Chung-Hsien Kuo |
Oral defense committee: | 宋開泰 Kai-Tai Song, 林其禹 Chyi-Yeu Lin, 蘇順豐 Shun-Feng Su, 徐繼聖 Gee-Sern Hsu |
Degree: | Master |
Department: | College of Electrical Engineering and Computer Science - Department of Electrical Engineering |
Year of publication: | 2017 |
Academic year of graduation: | 105 (ROC calendar) |
Language: | Chinese |
Number of pages: | 60 |
Keywords: | deep learning, region-based CNN, YOLO v2, indoor object recognition, elevator button recognition |
This thesis proposes a deep learning classifier for recognizing the objects that a personal, indoor visual-servo manipulation system must operate. To improve the independence of people living with spinal cord injuries (SCIs), the work targets the two indoor scenarios these users need most: indoor object recognition and elevator button recognition. Both are built on a modified version of the YOLO v2 (You Only Look Once) deep learning network. For indoor object recognition, the thesis assembles a hybrid dataset that merges ImageNet, PASCAL VOC (Visual Object Classes), and self-collected images. For elevator button recognition, a conventional single-layer network fails because the button content is too small to recognize, so the thesis proposes a novel "double-layer recognition network": the first layer locates the buttons, each button region is then cropped from the image and passed through image enhancement, and the enhanced crop becomes the input to a second-layer network that recognizes the button content. In addition, to compensate for buttons and contents that the deep learning network may miss in real use, a knowledge-based reasoning (KBR) button inference engine is added to raise the button recognition rate. In experiments on real scenes, indoor object recognition reached a recognition rate of 66.08%. Elevator button recognition reached 64.53% with neither image enhancement nor the KBR inference engine, 66.31% with image enhancement, and 81.08% once the KBR inference engine was added.
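The button recognition flow described in the abstract (locate buttons, crop each region, enhance it, read its content, then infer what was missed) can be illustrated with a minimal Python sketch. This is not the thesis implementation. The two networks are replaced by caller-supplied functions (`detect_buttons` and `read_content`, both hypothetical names), a simple contrast stretch stands in for the unspecified image enhancement step, and the inference rule (a single column of buttons whose labels follow a known top-to-bottom order) is an assumed toy rule, since the abstract does not state the rules of the knowledge base.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Sequence, Tuple

Box = Tuple[int, int, int, int]   # (x, y, width, height) in pixels
Image = List[List[int]]           # grayscale image as rows of 0-255 values


@dataclass
class Button:
    box: Box
    label: Optional[str]          # None when the content could not be read


def crop(image: Image, box: Box) -> Image:
    """Cut the button region out of the full image."""
    x, y, w, h = box
    return [row[x:x + w] for row in image[y:y + h]]


def stretch_contrast(patch: Image) -> Image:
    """Stand-in enhancement (assumption): rescale intensities to 0-255."""
    flat = [p for row in patch for p in row]
    lo, hi = min(flat), max(flat)
    if hi == lo:
        return patch
    return [[(p - lo) * 255 // (hi - lo) for p in row] for row in patch]


def recognize_buttons(
    image: Image,
    detect_buttons: Callable[[Image], Sequence[Box]],    # stage 1 (hypothetical)
    read_content: Callable[[Image], Optional[str]],      # stage 2 (hypothetical)
) -> List[Button]:
    """Two-stage flow: detect boxes, then read each enhanced crop."""
    buttons = []
    for box in detect_buttons(image):
        patch = stretch_contrast(crop(image, box))
        buttons.append(Button(box, read_content(patch)))
    return buttons


def infer_missing_labels(
    buttons: Sequence[Button], top_to_bottom: Sequence[str]
) -> List[Button]:
    """Toy knowledge-based rule (assumption): one column, known label order.

    Sort buttons from top to bottom, anchor on the first button whose label
    was read, and fill unread labels from their offset in the known order.
    """
    ordered = sorted(buttons, key=lambda b: b.box[1])
    anchor = next(
        ((i, top_to_bottom.index(b.label))
         for i, b in enumerate(ordered) if b.label in top_to_bottom),
        None,
    )
    if anchor is None:
        return list(ordered)      # nothing to anchor on, so infer nothing
    i0, k0 = anchor
    result = []
    for i, b in enumerate(ordered):
        k = k0 + (i - i0)
        if b.label is None and 0 <= k < len(top_to_bottom):
            b = Button(b.box, top_to_bottom[k])
        result.append(b)
    return result
```

Separating the stages this way keeps the second network's input small and centered on one button, which is the motivation the abstract gives for the double-layer design. The inference step only fills labels that the second stage left unread and never overrides a label that was recognized.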