
Author: CHANG, EN-SHUO (張恩碩)
Thesis title: Real-Time Lane and Road Marking Detection Using Deep Neural Networks (基於深度神經網路實現即時道路及標線分割)
Advisor: Yie-Tarng Chen (陳郁堂)
Committee members: Hsing-Lung Chen (陳省隆), Ming-Bo Lin (林銘波), Chang Hong Lin (林昌鴻), Wen-Hsien Fang (方文賢)
Degree: Master
Department: Department of Electronic and Computer Engineering (電資學院 - 電子工程系)
Year of publication: 2019
Academic year of graduation: 107
Language: Chinese
Number of pages: 24
Chinese keywords: Real-time Semantic Segmentation, Bilateral Segmentation Network, Road Marking Detection
Foreign-language keywords: Real-time Semantic Segmentation, Bilateral Segmentation Network, Road Marking Detection

This thesis studies the problem of detecting lanes and road markings for self-driving cars. We formulate it as a semantic segmentation problem and address it with deep learning. Because a self-driving car must analyze road scenes in real time, our research focuses on new neural network models for semantic segmentation that reduce inference time while maintaining detection accuracy. Inspired by ENet and BiSeNet, we propose two neural network models for this problem. First, we design a new, smaller model by removing several redundant bottleneck layers from ENet and then applying a knowledge distillation scheme: the outputs of the original complex, high-accuracy ENet serve as soft labels that guide the training of the new small and fast model and improve its detection accuracy. Second, we propose a novel architecture with two parallel networks: one network extracts the spatial features of the input, while the other enlarges the receptive field and encodes contextual information; the outputs of the two networks are then combined to produce the final result. To evaluate the two proposed models, we conduct experiments on the ITRI road marking dataset. The results demonstrate the superiority of our proposed models, even when trained with a smaller amount of training data.
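As a rough illustration of the soft-label training step described above, the sketch below combines a hard-label cross-entropy term with a temperature-softened KL-divergence term in the style of Hinton et al. [10]. It is a minimal PyTorch example written for this page; the function name, the temperature T, and the weight alpha are illustrative choices, not values taken from the thesis.

import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.5):
    """Per-pixel distillation loss for semantic segmentation.

    student_logits, teacher_logits: (N, C, H, W) class scores.
    labels: (N, H, W) ground-truth class indices.
    T: temperature that softens the teacher's distribution.
    alpha: weight between the hard-label and soft-label terms.
    """
    # Hard-label term: ordinary cross-entropy against the ground truth.
    hard = F.cross_entropy(student_logits, labels)

    # Soft-label term: KL divergence between the softened teacher and
    # student distributions, scaled by T^2 as in Hinton et al. [10].
    soft = F.kl_div(
        F.log_softmax(student_logits / T, dim=1),
        F.softmax(teacher_logits / T, dim=1),
        reduction="batchmean",
    ) * (T * T)

    return alpha * hard + (1.0 - alpha) * soft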


This thesis investigates the problem of detecting lane and road markings for self-driving cars. We pose this problem as a semantic segmentation problem, which is addressed by a deep learning scheme. Self-driving cars must analyze road scenes in real time. Therefore, our research focuses on new neural network models for semantic segmentation that can reduce the inference time while maintaining the detection accuracy. Inspired by ENet and BiSeNet, we propose two neural network models for this problem. First, we design a new, smaller model by removing several redundant bottleneck layers from ENet and then applying the knowledge distillation scheme, which uses the results of the original complex and high-precision ENet as soft labels to guide the training of the new small and fast neural network model and boost its detection accuracy. Next, we propose a novel network architecture with two parallel networks: one network infers the spatial features of the input, while the other is responsible for extending the receptive fields and encoding contextual information; the results from both networks are then combined to produce the final output. To assess the performance of the two proposed network models, we perform experiments on the ITRI road marking dataset. The experimental results demonstrate the superiority of our proposed network models, even when using a smaller amount of training data.
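The two-path idea in the abstract, a detail-preserving spatial path plus a context path with a larger receptive field whose outputs are fused before classification, can be sketched as follows. This is a toy PyTorch illustration of the general bilateral design from BiSeNet [2], not the thesis's actual Bilateral ENet; all layer widths, strides, and the class name TwoPathSegNet are made up for the example.

import torch
import torch.nn as nn
import torch.nn.functional as F

def conv_bn_relu(in_ch, out_ch, stride):
    return nn.Sequential(
        nn.Conv2d(in_ch, out_ch, 3, stride=stride, padding=1, bias=False),
        nn.BatchNorm2d(out_ch),
        nn.ReLU(inplace=True),
    )

class TwoPathSegNet(nn.Module):
    """Toy bilateral network: a shallow spatial path keeps resolution,
    a deeper context path widens the receptive field, and their features
    are fused before the per-pixel classifier."""

    def __init__(self, num_classes):
        super().__init__()
        # Spatial path: three stride-2 convs -> 1/8 resolution, rich detail.
        self.spatial = nn.Sequential(
            conv_bn_relu(3, 64, 2),
            conv_bn_relu(64, 64, 2),
            conv_bn_relu(64, 128, 2),
        )
        # Context path: keeps downsampling to 1/32 to enlarge the receptive field.
        self.context = nn.Sequential(
            conv_bn_relu(3, 32, 2),
            conv_bn_relu(32, 64, 2),
            conv_bn_relu(64, 128, 2),
            conv_bn_relu(128, 128, 2),
            conv_bn_relu(128, 128, 2),
        )
        # Fusion: concatenate the two paths and classify per pixel.
        self.fuse = conv_bn_relu(256, 256, 1)
        self.classifier = nn.Conv2d(256, num_classes, 1)

    def forward(self, x):
        sp = self.spatial(x)   # (N, 128, H/8,  W/8)
        cx = self.context(x)   # (N, 128, H/32, W/32)
        # Upsample the context features to match the spatial features.
        cx = F.interpolate(cx, size=sp.shape[2:], mode="bilinear", align_corners=False)
        out = self.classifier(self.fuse(torch.cat([sp, cx], dim=1)))
        # Upsample the logits back to the input resolution.
        return F.interpolate(out, size=x.shape[2:], mode="bilinear", align_corners=False)

In the thesis's own networks, judging from the table of contents, the context path would be a deeper backbone and attention refinement and feature fusion modules would replace the plain concatenation used here.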

Abstract
Acknowledgment
Table of Contents
List of Figures
List of Tables
List of Acronyms
1 Introduction
2 Related Work
  2.1 ENet Semantic Segmentation
  2.2 BiSeNet
  2.3 Knowledge Distilling
  2.4 MobileNet
3 The Proposed Real-time Networks for Road Marks Detection
  3.1 Bilateral ENet
    3.1.1 Network architecture
    3.1.2 Context path
    3.1.3 Spatial path
    3.1.4 Attention refinement module
    3.1.5 Feature fusion module
    3.1.6 Loss function
  3.2 Use knowledge distillation to keep accuracy
    3.2.1 Lightweight ENet
    3.2.2 Knowledge distillation
4 Experimental Results
  4.1 Dataset
  4.2 Bilateral ENet Experimental Results
    4.2.1 Dataset
    4.2.2 Bilateral ENet details and results
  4.3 Knowledge distillation details
    4.3.1 Training details and results
5 Conclusion
References

[1] A. Paszke, A. Chaurasia, S. Kim, and E. Culurciello, "ENet: A deep neural network architecture for real-time semantic segmentation," arXiv preprint arXiv:1606.02147, 2016.
[2] C. Yu, J. Wang, C. Peng, C. Gao, G. Yu, and N. Sang, "BiSeNet: Bilateral segmentation network for real-time semantic segmentation," in Proceedings of the European Conference on Computer Vision (ECCV), pp. 325-341, 2018.
[3] T.-Y. Lin, M. Maire, S. Belongie, J. Hays, P. Perona, D. Ramanan, P. Dollár, and C. L. Zitnick, "Microsoft COCO: Common objects in context," in European Conference on Computer Vision, pp. 740-755, Springer, 2014.
[4] J. Long, E. Shelhamer, and T. Darrell, "Fully convolutional networks for semantic segmentation," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 3431-3440, 2015.
[5] O. Ronneberger, P. Fischer, and T. Brox, "U-Net: Convolutional networks for biomedical image segmentation," in International Conference on Medical Image Computing and Computer-Assisted Intervention, pp. 234-241, Springer, 2015.
[6] K. He, G. Gkioxari, P. Dollár, and R. Girshick, "Mask R-CNN," in Proceedings of the IEEE International Conference on Computer Vision, pp. 2961-2969, 2017.
[7] J. Greenhalgh and M. Mirmehdi, "Automatic detection and recognition of symbols and text on the road surface," in International Conference on Pattern Recognition Applications and Methods, pp. 124-140, Springer, 2015.
[8] S. Lee, J. Kim, J. Shin Yoon, S. Shin, O. Bailo, N. Kim, T.-H. Lee, H. Seok Hong, S.-H. Han, and I. So Kweon, "VPGNet: Vanishing point guided network for lane and road marking detection and recognition," in Proceedings of the IEEE International Conference on Computer Vision, pp. 1947-1955, 2017.
[9] C. Peng, X. Zhang, G. Yu, G. Luo, and J. Sun, "Large kernel matters - improve semantic segmentation by global convolutional network," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4353-4361, 2017.
[10] G. Hinton, O. Vinyals, and J. Dean, "Distilling the knowledge in a neural network," arXiv preprint arXiv:1503.02531, 2015.
[11] A. G. Howard, M. Zhu, B. Chen, D. Kalenichenko, W. Wang, T. Weyand, M. Andreetto, and H. Adam, "MobileNets: Efficient convolutional neural networks for mobile vision applications," arXiv preprint arXiv:1704.04861, 2017.
[12] K. He, X. Zhang, S. Ren, and J. Sun, "Deep residual learning for image recognition," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 770-778, 2016.
[13] H. Zhao, J. Shi, X. Qi, X. Wang, and J. Jia, "Pyramid scene parsing network," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 2881-2890, 2017.
[14] L.-C. Chen, G. Papandreou, I. Kokkinos, K. Murphy, and A. L. Yuille, "DeepLab: Semantic image segmentation with deep convolutional nets, atrous convolution, and fully connected CRFs," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 40, no. 4, pp. 834-848, 2017.
[15] L.-C. Chen, G. Papandreou, F. Schroff, and H. Adam, "Rethinking atrous convolution for semantic image segmentation," arXiv preprint arXiv:1706.05587, 2017.
[16] W. Liu, A. Rabinovich, and A. C. Berg, "ParseNet: Looking wider to see better," arXiv preprint arXiv:1506.04579, 2015.
[17] V. Badrinarayanan, A. Kendall, and R. Cipolla, "SegNet: A deep convolutional encoder-decoder architecture for image segmentation," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 39, no. 12, pp. 2481-2495, 2017.
[18] S. Xie and Z. Tu, "Holistically-nested edge detection," in Proceedings of the IEEE International Conference on Computer Vision, pp. 1395-1403, 2015.
[19] S. Ioffe and C. Szegedy, "Batch normalization: Accelerating deep network training by reducing internal covariate shift," arXiv preprint arXiv:1502.03167, 2015.
[20] X. Glorot, A. Bordes, and Y. Bengio, "Deep sparse rectifier neural networks," in Proceedings of the Fourteenth International Conference on Artificial Intelligence and Statistics, pp. 315-323, 2011.
