Graduate student: | 邱建誠 Chien-Cheng Chyou |
---|---|
Thesis title: | 基於物件偵測網路之邊界框資料結構改良 Bounding Box Data Structure Improvement of Object Detection Network |
Advisor: | 王乃堅 Nai-Jian Wang |
Oral defense committee: | 王乃堅 Nai-Jian Wang, 呂學坤 Shyue-Kung Lu, 郭景明 Jing-Ming Guo, 鍾順平 Shun-Ping Chung, 蘇順豐 Shun-Feng Su |
Degree: | Master |
Department: | College of Electrical Engineering and Computer Science, Department of Electrical Engineering |
Year of publication: | 2018 |
Academic year of graduation: | 106 |
Language: | Chinese |
Pages: | 37 |
Chinese keywords: | 物件偵測, 類神經網路, 深度學習 |
English keywords: | Object Detection, Artificial Neural Network, Deep Learning |
Object detection aims to find objects of interest and to report each object's category and position. Common applications include machine vision, factory automation, and electric vehicles.

In traditional object detection networks, the bounding box data structure restricts an object's center to a certain range, so during training only one grid cell can correctly predict a given object. This restricts the choice of training methods and may rule out methods that would train a better object detection network. This thesis proposes a new bounding box data structure that removes the restriction on the object's center entirely, so that multiple grid cells can correctly predict the same object. Based on the new data structure, the thesis also proposes a new training method. This method reduces the difficulty of training on objects whose centers lie near grid cell boundaries, which lets the features of the network model serve other hard-to-train objects. As a result, the proposed training method can detect some objects that the old training method misses.

In addition, the new bounding box data structure has no side effects under the old training method and adapts better to the new one. On test data, with the old training method, the old data structure reaches an intersection over union (IoU) of 88.8% and the new data structure reaches 89.4%. With the new training method, the old data structure reaches 87.9% and the new data structure reaches 89.6%.
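The idea of letting several grid cells predict the same object, and the IoU metric used to report the results, can be illustrated with a small sketch. The abstract does not specify the thesis's actual encoding, so everything here is an assumption made for illustration: a YOLO-style grid, a center offset measured in cell units, and the hypothetical helpers `encode_center`, `decode_box`, and `iou`. In this reading, a constrained structure would only accept offsets in [0, 1), which ties each object to the single cell containing its center, while an unconstrained structure accepts any offset.

```python
from typing import Tuple

# Box as (x_min, y_min, x_max, y_max) in image-normalized units (assumed convention).
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def encode_center(cx: float, cy: float, col: int, row: int, grid: int) -> Tuple[float, float]:
    """Offset of the box center from cell (col, row), in cell units.

    An offset inside [0, 1) on both axes means the center lies in that cell.
    """
    return cx * grid - col, cy * grid - row


def decode_box(dx: float, dy: float, w: float, h: float,
               col: int, row: int, grid: int) -> Box:
    """Rebuild a box from a cell index, a center offset, and a width and height."""
    cx = (col + dx) / grid
    cy = (row + dy) / grid
    return (cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2)


# Hypothetical example: a 4x4 grid and an object whose center sits exactly
# on the vertical edge between columns 1 and 2.
GRID = 4
truth: Box = (0.375, 0.25, 0.625, 0.5)  # center (0.5, 0.375), size 0.25 x 0.25

# Offset seen from the cell that contains the center: within [0, 1).
owner_offset = encode_center(0.5, 0.375, col=2, row=1, grid=GRID)
# Offset seen from the left neighbor: outside [0, 1), so a constrained
# structure could not represent it.
neighbor_offset = encode_center(0.5, 0.375, col=1, row=1, grid=GRID)

from_owner = decode_box(*owner_offset, 0.25, 0.25, col=2, row=1, grid=GRID)
from_neighbor = decode_box(*neighbor_offset, 0.25, 0.25, col=1, row=1, grid=GRID)

print(owner_offset, neighbor_offset)
print(iou(truth, from_owner), iou(truth, from_neighbor))
```

In this sketch both cells decode to the same box, so either one could serve as a correct prediction for an object centered on a cell edge. How the thesis actually parameterizes and trains the unconstrained offsets is not stated in the abstract.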