
Graduate Student: De-Hui Jian (簡德輝)
Thesis Title: Vision-Based Parking Slot Detection Based on End-to-End Semantic Segmentation Training (基於端對端語義分割訓練之停車格偵測系統)
Advisor: Chang-Hong Lin (林昌鴻)
Committee Members: Jenq-Shiou Leu (呂政修), Tsung-nan Lin (林宗男), Pei-Yuan Wu (吳沛遠), Chang-Hong Lin (林昌鴻)
Degree: Master
Department: Department of Electronic and Computer Engineering, College of Electrical Engineering and Computer Science
Publication Year: 2019
Graduation Academic Year: 107
Language: English
Number of Pages: 72
Keywords (Chinese): 自動停車系統、停車格偵測、深度學習、語義分割、多任務學習、端對端訓練
Keywords (English): Automatic parking systems, Parking Slot Detection, Deep Learning, Semantic Segmentation, Multi-task learning, End-to-End Training

Automatic parking systems are a major challenge in the field of self-driving cars, particularly because the features of parking slots in images become unclear as the external environment changes, for example on rainy days, in the morning, or at night, which makes implementation difficult. Before automatic parking technology, many cars were commonly equipped with reversing radars as a parking aid, but for novices or drivers unfamiliar with parking, the help such a system can provide is limited. With advances in deep learning, many methods have begun to perform parking slot detection on images using deep learning. The greatest advantage of vision-based parking slot detection is that it can effectively analyze spatial information, such as the size, angle, and coordinates of parking slots. Among current vision-based parking slot detection systems, most methods first locate the coordinates of the corner center points of parking slots and then use them to decide whether a slot is valid; however, this approach loses too much information during post-processing, which lowers accuracy. The most accurate method to date trains two separate deep learning models to predict the corner center coordinates and the type of parking slots. It achieves high accuracy, but because it is not an end-to-end model, considerable time must be spent preparing three different datasets and handling the issues of connecting the two models. The method proposed in this thesis uses multi-task learning to connect two identical semantic segmentation models and train them together; the training data are images of the parking slot lines and of the corner center points, and post-processing is then used to find the parking slot coordinates. The proposed method achieves an average recall rate of 95.06%, a precision rate of 99.47%, and an F-measure of 97.22%, which is the best result among current end-to-end models.
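As a quick check, the F-measure reported above is simply the harmonic mean of precision and recall; the short Python snippet below, with the values taken from the abstract, reproduces the 97.22% figure.

# F-measure (F1) is the harmonic mean of precision and recall.
def f_measure(precision: float, recall: float) -> float:
    return 2 * precision * recall / (precision + recall)

precision = 0.9947  # reported precision rate
recall = 0.9506     # reported average recall rate
print(f"F-measure: {f_measure(precision, recall):.2%}")  # -> 97.22%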


Automatic parking systems (APSs) are a major challenge in the field of self-driving cars. The main reason is that the features of parking slots are affected by different environments, such as rainy days, mornings, or nights. Before automatic parking systems, many cars were equipped with reversing radars to assist drivers in parking, but these are of limited help to beginners who are not good at parking. With the development of machine learning and deep learning, many methods use these techniques for parking slot detection. Recently, a high-accuracy method used two deep learning models with post-processing to find parking slots, but it is not an end-to-end training model and requires considerable time to prepare three datasets and train two neural networks. This thesis proposes an end-to-end training method based on semantic segmentation. With multi-task learning, we combine two semantic segmentation models to obtain the line and point images of parking slots. Finally, we design an algorithm to find the coordinates of parking slots. The recall, precision, and F-measure of the proposed method are 95.06%, 99.47%, and 97.22%, respectively, which are better than those of other end-to-end training methods.
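To make the training setup concrete, the following is a minimal sketch, assuming Keras [30], of an end-to-end model with two segmentation branches (parking-slot lines and corner center points) optimized jointly through a multi-task loss. The network depth, the way the two branches are combined, and the loss choice are illustrative assumptions rather than the exact configuration used in the thesis.

# Rough sketch only: one end-to-end model with two segmentation outputs
# (line mask and corner-point mask) trained jointly via a weighted loss.
import tensorflow as tf
from tensorflow.keras import layers, Model

def conv_block(x, filters):
    # Two 3x3 convolutions with batch normalization and ReLU.
    for _ in range(2):
        x = layers.Conv2D(filters, 3, padding="same")(x)
        x = layers.BatchNormalization()(x)
        x = layers.ReLU()(x)
    return x

def small_unet(x, name):
    # A shallow U-Net-style encoder-decoder producing a 1-channel mask.
    c1 = conv_block(x, 32)
    p1 = layers.MaxPooling2D()(c1)
    c2 = conv_block(p1, 64)
    u1 = layers.Conv2DTranspose(32, 2, strides=2, padding="same")(c2)
    u1 = layers.Concatenate()([u1, c1])
    c3 = conv_block(u1, 32)
    return layers.Conv2D(1, 1, activation="sigmoid", name=name)(c3)

inputs = layers.Input(shape=(256, 256, 3))       # bird's-eye-view image (assumed size)
line_mask = small_unet(inputs, name="lines")     # parking-slot line mask
point_mask = small_unet(inputs, name="points")   # corner center-point mask

model = Model(inputs, [line_mask, point_mask])
model.compile(
    optimizer="adam",
    loss={"lines": "binary_crossentropy", "points": "binary_crossentropy"},
    loss_weights={"lines": 1.0, "points": 1.0},  # multi-task weighting (assumed equal)
)

In this sketch the multi-task coupling is expressed only through the shared input and the weighted sum of the two losses; the thesis instead connects two identical semantic segmentation models and trains them together, so the wiring above should be read as an approximation of the idea, not the proposed architecture.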

Abstract (Chinese) I
Abstract (English) II
Acknowledgements III
LIST OF CONTENTS IV
LIST OF FIGURES VI
LIST OF TABLES VIII
CHAPTER 1 INTRODUCTION 9
1.1 Motivation 9
1.2 Contributions 11
1.3 Thesis Organization 12
CHAPTER 2 RELATED WORKS 13
2.1 Vision-based Parking Slot Detection based on Image Processing 13
2.2 Vision-based Parking Slot Detection based on Machine Learning 14
2.3 Vision-based Parking Slot Detection based on Deep Learning 15
CHAPTER 3 U-NET [11] INTRODUCTION 16
3.1 Convolution Layer 17
3.2 Rectified Linear Units (ReLU) 19
3.3 Dropout [36] and Batch Normalization [15] 20
3.4 Max Pooling 21
3.5 Transpose Convolution 22
3.6 Concatenation 23
3.7 Loss Function 23
3.8 Optimization 24
CHAPTER 4 PROPOSED METHOD 25
4.1 Deep Neural Networks 25
4.1.1 Multi-task Learning 25
4.1.2 Ground Truth Setting 34
4.1.3 Data Augmentation 38
4.2 Post Processing 40
4.2.1 Connected Component 41
4.2.2 Finding P1 and P2 Based on the Entrance Lines 42
4.2.3 Finding P3 and P4 Based on the Separating Lines 45
4.2.4 Filter out Invalid Parking Slots 48
4.2.5 Extending Separating Lines and Finding Coordinate Sequences of Parking Slots 49
CHAPTER 5 EXPERIMENTAL RESULTS 51
5.1 Experimental Environment 51
5.2 Parking Slot Database 52
5.3 Performance Evaluation 55
5.4 Comparison with Existing Methods and Analyses 55
CHAPTER 6 CONCLUSIONS AND FUTURE WORKS 64
6.1 Conclusions 64
6.2 Future Works 65
REFERENCES 66
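Section 4.2.1 in the outline above refers to a connected-component step in the post-processing. The snippet below is a minimal sketch of what such a step typically looks like, assuming OpenCV; the mask contents and sizes are made-up placeholders, not data from the thesis.

import cv2
import numpy as np

# Hypothetical binary corner-point mask (in practice, the thresholded
# output of the point-segmentation branch); values and size are made up.
mask = np.zeros((8, 8), dtype=np.uint8)
mask[1:3, 1:3] = 1   # one blob
mask[5:7, 4:7] = 1   # another blob

# Label connected components; each component's centroid is taken as a
# candidate corner center point.
num_labels, labels, stats, centroids = cv2.connectedComponentsWithStats(mask)
candidate_points = centroids[1:]  # row 0 is the background component
print(num_labels - 1, "candidate points:")
print(candidate_points)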

[1] Newelectronics website: http://www.newelectronics.co.uk/electronics-technology/an-introduction-to-ultrasonic-sensors-for-vehicle-parking/24966/, [Online].
[2] L. Zhang, J. Huang, X. Li and L. Xiong, “Vision-Based Parking-slot detection: A DCNN-Based Approach and a Large-Scale Benchmark Dataset,” IEEE Transactions on Image Processing (ITIP), vol. 27, no. 11, pages 5350-5364, Nov. 2018.
[3] L. Li, L. Zhang, X. Li, X. Liu, Y. Shen, and L. Xiong, “Vision-based parking-slot detection: A benchmark and a learning-based approach,” in Proceedings of the IEEE International Conference on Multimedia and Expo (ICME), pages 649–654, Jul. 2017.
[4] J. K. Suhr and H. G. Jung, “Full-automatic recognition of various parking slot markings using a hierarchical tree structure,” in Optical Engineering, 52(3), 037203, Mar. 2013.
[5] K. Hamada, Z. Hu, M. Fan, and H. Chen, “Surround view based parking lot detection and tracking,” in IEEE Intelligent Vehicles Symposium (IV), pages 1106-1111, Jun. 28 – Jul. 1, 2015.
[6] J. Redmon and A. Farhadi, “Yolo9000: Better, faster, stronger,” in IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pages 6517–6525, 2017.
[7] H. G. Jung, D. S. Kim, P. J. Yoon, and J. Kim, “Parking slot markings recognition for automatic parking assist system,” in Intelligent Vehicles Symposium (IV), pages 106–113, 2006.
[8] J. C. Balloch, V. Agrawal, I. Essa, and S. Chernova, “Unbiasing semantic segmentation for robot perception using synthetic data feature transfer,” in arXiv preprint arXiv:1809.03676, 2018.
[9] C. Wang, H. Zhang, M. Yang, X. Wang, L. Ye, and C. Guo, “Automatic parking based on a bird’s eye view vision system,” in Mechanical Engineering, Volume 2014, Article ID 847406, 2014.
[10] T. Shen, G. Lin, L. Liu, C. Shen, and I. Reid, “Weakly supervised semantic segmentation based on web image co-segmentation,” in arXiv preprint arXiv:1705.09052, 2017.
[11] O. Ronneberger, P. Fischer, and T. Brox, “U-net: Convolutional networks for biomedical image segmentation,” in International Conference on Medical Image Computing and Computer-Assisted Intervention (MICCAI), pages 234–241, 2015.
[12] J. Long, E. Shelhamer, and T. Darrell, “Fully convolutional networks for semantic segmentation,” in IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pages 3431-3440, 2015.
[13] V. Nair and G. E. Hinton, “Rectified linear units improve restricted boltzmann machines,” in International Conference on Machine Learning (ICML), 2010.
[14] Math Bench website: https://mathbench.umd.edu/modules/cell-processes_diffusion/page05.htm, [Online].
[15] S. Ioffe and C. Szegedy, “Batch normalization: Accelerating deep network training by reducing internal covariate shift,” in International Conference on Machine Learning (ICML), 2015.
[16] F. Milletari, N. Navab, and S.-A. Ahmadi, “V-net: Fully convolutional neural networks for volumetric medical image segmentation,” in 2016 Fourth International Conference on 3D Vision (3DV), pages 565-571, 2016.
[17] C. Harris and M. Stephens, “A combined corner and edge detector,” in 4th Alvey Vision Conference (AVC), pages 147–151, 1988.
[18] R. O. Duda and P. E. Hart, “Use of the Hough transform to detect lines and curves in pictures,” in Communications of the ACM (CACM), pages 11-15, 1972.
[19] N. Srivastava, G. Hinton, A. Krizhevsky, I. Sutskever, and R. Salakhutdinov, “Dropout: A simple way to prevent neural networks from overfitting,” in Journal of Machine Learning Research (JMLR), pages 1929-1958, 2014.
[20] Hung-yi Lee machine learning course website (NTU): http://speech.ee.ntu.edu.tw/~tlkagk/index.html, [Online].
[21] S. Ren, K. He, R. Girshick, and J. Sun, “Faster r-cnn: Towards real-time object detection with region proposal networks,” in Advances in neural information processing systems (NIPS), pages 91–99, 2015.
[22] Motor Vehicle Driver Information Service (MVDIS) law searching system website: https://www.mvdis.gov.tw/webMvdisLaw/LawContent.aspx?LawID=G0058000, [Online].
[23] cheguanyi website (the rules for cars in China): http://www.cheguanyi.com/cheguan/2331.html, [Online].
[24] P. R. Evans, “Rotations and rotation matrices,” in Acta Crystallographica Section D: Biological Crystallography (IUCr), 57 (pt 10), pages 1355-1359, 2001.
[25] R. A. Hummel, B. Kimia, and S.W. Zucker, “Deblurring Gaussian Blur,” New York University, Courant Institute of Mathematical Sciences, Computer Science Division, 1986. [book]
[26] J.A. Storer, “An Introduction to Data Structures and Algorithms,” Springer Science & Business Media, 2012. [book]
[27] COCO evaluation metrics: http://cocodataset.org/#detection-eval, [Online].
[28] M. J. D. Smith, M. F. Goodchild, and P. Longley, “Geospatial Analysis: A Comprehensive Guide to Principles, Techniques and Software Tools,” Troubador Publishing Ltd, 2007. [book]
[29] Python website: https://www.python.org/, [Online].
[30] Keras website: https://keras.io/, [Online].
[31] P. Viola and M. Jones, “Rapid Object Detection Using a Boosted Cascade of Simple Features,” in Proceedings of the 2001 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, pages 511-518, 2001.
[32] D. P. Kingma and J. L. Ba, “Adam: A Method for Stochastic Optimization,” in International Conference on Learning Representations (ICLR), 2015.
[33] I. Bilbao and J. Bilbao, “Overfitting problem and the over-training in the era of data: Particularly for Artificial Neural Networks,” in International Conference on the Internet, Cyber Security and Information Systems (ICICIS), pages 173-177, 2017.
[34] S. Helgason, “The Radon Transform,” Springer Science & Business Media, 1999. [book]
[35] C. I. Gonzalez, P. Melin, J. R. Castro, and O. Castillo, “Edge Detection Methods Based on Generalized Type-2 Fuzzy Logic,” Springer, 2017. [book]
[36] G. B. Thomas, Jr., M. D. Weir, J. R. Hass, “Thomas' Calculus: Early Transcendentals in SI Units,” Pearson Education Limited, 2016. [book]
[37] S. Haykin, and S. S. Haykin, “Neural Networks and Learning Machines,” Pearson Education, 2011. [book]
[38] Y. Huang, S. Chen, Y. Chen, Z. Jian, and N. Zheng, “Spatial-temporal based lane detection using deep learning,” in International Conference on Artificial Intelligence Applications and Innovations (IFIP), 2018.
