
Author: Shih-Hsun Lin (林世勳)
Title: Road Sign Detection in the Wild (開放環境下之路標偵測)
Advisor: Yong-Lin Kuo (郭永麟)
Committee: Cheng-Hsiung Yang (楊振雄), Sheng-Dong Xu (徐勝均), Peter I-Tsyuen Chang (張以全), Yong-Lin Kuo (郭永麟)
Degree: Master
Department: College of Engineering, Graduate Institute of Automation and Control
Year of publication: 2019
Academic year of graduation: 107 (ROC calendar, 2018-2019)
Language: Chinese
Pages: 122
Keywords (Chinese): 路標偵測, 深度學習
Keywords (English): road sign detection, deep learning
Views: 795; Downloads: 0



Road signs (also called signposts) convey essential information on the road: the current location and the direction to a destination. They allow people to identify where they are and head toward the correct destination quickly, even in an unfamiliar environment. Although there is a large body of research on object recognition with deep learning, few models target road signs specifically, so a practitioner who wants to implement road-sign recognition today has no off-the-shelf model to build on.
This study is one of the few to apply state-of-the-art deep-learning detectors to road sign detection. First, we collected images of common road signs and built an image database annotated with both vertical and horizontal signs. Based on scene content, the database is divided into three categories for recognition experiments: over-exposed (EV-H), normal (EV-M), and rainy/night (EV-L). We then trained models with three mature real-time detectors: You Only Look Once versions 2 and 3 (YOLO v2, YOLO v3) and the Single Shot MultiBox Detector (SSD). For YOLO v2, we additionally compared improved variants that divide the image into different numbers of grid cells (5×5, 11×11, 21×21) and that recompute the anchor-box values from our own database. The best model of each group was selected on a validation set, and these models were then evaluated on test sets under the different conditions (over-exposed EV-H, normal EV-M, rainy/night EV-L) to identify the best overall model.
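The anchor-box recalculation mentioned above follows the YOLOv2 idea of clustering the dataset's box shapes with K-means under a 1 − IoU distance, so the anchors reflect the shapes that actually occur in the database. A minimal sketch under that assumption (function names are ours, not from the thesis; the first k boxes seed the centers for determinism):

```python
def iou_wh(a, b):
    """IoU of two boxes given as (w, h), both anchored at the origin."""
    inter = min(a[0], b[0]) * min(a[1], b[1])
    return inter / (a[0] * a[1] + b[0] * b[1] - inter)

def kmeans_anchors(boxes, k, iters=100):
    """Cluster (w, h) pairs with 1 - IoU as the distance, YOLOv2-style."""
    centers = list(boxes[:k])  # deterministic seeding for this sketch
    for _ in range(iters):
        # Assign each box to the center it overlaps most (smallest 1 - IoU).
        groups = [[] for _ in range(k)]
        for b in boxes:
            j = max(range(k), key=lambda i: iou_wh(b, centers[i]))
            groups[j].append(b)
        # Move each center to the mean shape of its group.
        new_centers = [
            (sum(b[0] for b in g) / len(g), sum(b[1] for b in g) / len(g))
            if g else centers[i]
            for i, g in enumerate(groups)
        ]
        if new_centers == centers:  # converged
            break
        centers = new_centers
    return sorted(centers)
```

Running this over all labeled box shapes in the database (widths and heights normalized to the image size) yields k anchors tuned to the data rather than the detector's defaults.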
Finally, by analyzing and comparing the detection results, we propose road-sign detection versions of YOLO v2, YOLO v3, and SSD: several models that recognize common road signs and detect the main signs in dashcam footage in real time. We reach the following conclusions: (1) among the YOLO models, Yolov2-lss21 performs best; among the SSD models, SSD 300 performs best. (2) The recognition rate of the YOLO models varies across the test sets (over-exposed EV-H, normal EV-M, rainy/night EV-L), whereas the SSD models are less sensitive to these conditions. (3) Combining the mean average precision (mAP) and frame-rate (frames per second, FPS) results, both Yolov2-lss21 and SSD 300 are recommended models.
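The mAP figures reported above rest on an IoU matching rule: a detection counts as a true positive only if it overlaps a still-unmatched ground-truth box by at least a threshold (commonly 0.5). A minimal sketch of that rule, with hypothetical function names not taken from the thesis:

```python
def iou(a, b):
    """IoU of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)

def match_detections(preds, gts, thresh=0.5):
    """Greedy matching: each prediction (highest score first) claims at most
    one unmatched ground-truth box with IoU >= thresh.
    Returns (true_positives, false_positives)."""
    used = set()
    tp = fp = 0
    for box, score in sorted(preds, key=lambda p: -p[1]):
        best, best_iou = None, thresh
        for i, g in enumerate(gts):
            if i in used:
                continue
            v = iou(box, g)
            if v >= best_iou:
                best, best_iou = i, v
        if best is None:
            fp += 1
        else:
            used.add(best)
            tp += 1
    return tp, fp
```

Sweeping the score threshold over these matches gives the precision-recall curve from which average precision, and then mAP over all sign classes, is computed.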

Table of Contents (page numbers omitted)
Acknowledgments; Abstract (Chinese); Abstract (English); Contents; List of Figures; List of Tables
Chapter 1  Introduction
  1.1 Background
  1.2 Literature review
    1.2.1 Image datasets
    1.2.2 Deep-learning object detection
    1.2.3 License-plate and road-sign recognition
  1.3 Motivation and methods
  1.4 Contributions
  1.5 Thesis organization
Chapter 2  Deep-Learning Object Detection Theory
  2.1 You Only Look Once (YOLO) real-time detection
    2.1.1 YOLOv1
    2.1.2 YOLOv2
    2.1.3 YOLOv3
    2.1.4 Model evaluation
  2.2 Single Shot MultiBox Detector (SSD)
    2.2.1 Architecture
    2.2.2 Loss function
  2.3 K-means clustering
Chapter 3  Experimental Design
  3.1 Hardware
    3.1.1 Dashcam
    3.1.2 Inference device (NVIDIA Jetson TX2)
    3.1.3 Training host
  3.2 Experimental methods
  3.3 Road-sign training images
    3.3.1 Frame capture
    3.3.2 Database construction
    3.3.3 Labeling
  3.4 Model training
    3.4.1 Parameter settings
    3.4.2 Training
    3.4.3 Saving models
  3.5 Road-sign recognition experiments
    3.5.1 Validation set
    3.5.2 Test sets
Chapter 4  Results and Analysis
  4.1 Training results
  4.2 Validation-set results
  4.3 Test-set results
    4.3.1 Experiment 1: over-exposed EV-H test set
    4.3.2 Experiment 2: rainy/night EV-L test set
    4.3.3 Experiment 3: normal EV-M test set
    4.3.4 Experiment 4: all test sets
  4.4 Data and results
Chapter 5  Conclusions and Suggestions
  5.1 Conclusions
  5.2 Future work
References


Full text embargoed until 2024/01/24 (campus network); not authorized for public release (off-campus network; National Digital Library of Theses and Dissertations).