
Graduate Student: De-Hui Jian (簡德輝)
Thesis Title: Vision-Based Parking Slot Detection Based on End-to-End Semantic Segmentation Training (基於端對端語義分割訓練之停車格偵測系統)
Advisor: Chang-Hong Lin (林昌鴻)
Committee Members: Jenq-Shiou Leu (呂政修), Tsung-nan Lin (林宗男), Pei-Yuan Wu (吳沛遠), Chang-Hong Lin (林昌鴻)
Degree: Master
Department: Department of Electronic and Computer Engineering, College of Electrical Engineering and Computer Science
Publication Year: 2019
Graduation Academic Year: 107
Language: English
Number of Pages: 72
Keywords (Chinese): 自動停車系統、停車格偵測、深度學習、語義分割、多任務學習、端對端訓練
Keywords (English): Automatic parking systems, Parking Slot Detection, Deep Learning, Semantic Segmentation, Multi-task learning, End-to-End Training

Automatic parking systems are a major challenge in the field of self-driving cars, particularly because the features of parking slots in images become unclear as the external environment changes, for example on rainy days, in the morning, or at night, which makes implementation difficult. Before automatic parking technology, many cars were commonly equipped with reversing radars as a parking aid, but for novices or drivers unfamiliar with parking, the help such a system can provide is limited. With advances in deep learning, many methods have begun to perform parking slot detection on images using deep learning. The greatest advantage of vision-based parking slot detection is that it can effectively analyze spatial information, such as the size, angle, and coordinates of parking slots. Among current vision-based parking slot detection systems, most methods first locate the coordinates of the corner center points of parking slots and then use them to decide whether a slot is valid; however, this approach loses too much information during post-processing, which lowers accuracy. The most accurate method to date trains two separate deep learning models to predict the corner center coordinates and the type of parking slots. It achieves high accuracy, but because it is not an end-to-end model, considerable time must be spent preparing three different datasets and handling the issues of connecting the two models. The method proposed in this thesis uses multi-task learning to connect two identical semantic segmentation models and train them together; the training data are images of the parking slot lines and of the corner center points, and post-processing is then used to find the parking slot coordinates. The proposed method achieves an average recall rate of 95.06%, a precision rate of 99.47%, and an F-measure of 97.22%, which is the best result among current end-to-end models.
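As a quick check, the F-measure reported above is simply the harmonic mean of precision and recall; the short Python snippet below, with the values taken from the abstract, reproduces the 97.22% figure.

# F-measure (F1) is the harmonic mean of precision and recall.
def f_measure(precision: float, recall: float) -> float:
    return 2 * precision * recall / (precision + recall)

precision = 0.9947  # reported precision rate
recall = 0.9506     # reported average recall rate
print(f"F-measure: {f_measure(precision, recall):.2%}")  # -> 97.22%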


Automatic parking systems (APSs) are a major challenge in the field of self-driving cars. The main reason is that the features of parking slots are affected by different environments, such as rainy days, mornings, or nights. Before automatic parking systems, many cars were equipped with reversing radars to assist drivers in parking, but these are of limited help to beginners who are not good at parking. With the development of machine learning and deep learning, many methods use these techniques for parking slot detection. Recently, a high-accuracy method used two deep learning models with post-processing to find parking slots, but it is not an end-to-end training model and requires considerable time to prepare three datasets and train two neural networks. This thesis proposes an end-to-end training method based on semantic segmentation. With multi-task learning, we combine two semantic segmentation models to obtain the line and point images of parking slots. Finally, we design an algorithm to find the coordinates of parking slots. The recall, precision, and F-measure of the proposed method are 95.06%, 99.47%, and 97.22%, respectively, which are better than those of other end-to-end training methods.
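To make the training setup concrete, the following is a minimal sketch, assuming Keras [30], of an end-to-end model with two segmentation branches (parking-slot lines and corner center points) optimized jointly through a multi-task loss. The network depth, the way the two branches are combined, and the loss choice are illustrative assumptions rather than the exact configuration used in the thesis.

# Rough sketch only: one end-to-end model with two segmentation outputs
# (line mask and corner-point mask) trained jointly via a weighted loss.
import tensorflow as tf
from tensorflow.keras import layers, Model

def conv_block(x, filters):
    # Two 3x3 convolutions with batch normalization and ReLU.
    for _ in range(2):
        x = layers.Conv2D(filters, 3, padding="same")(x)
        x = layers.BatchNormalization()(x)
        x = layers.ReLU()(x)
    return x

def small_unet(x, name):
    # A shallow U-Net-style encoder-decoder producing a 1-channel mask.
    c1 = conv_block(x, 32)
    p1 = layers.MaxPooling2D()(c1)
    c2 = conv_block(p1, 64)
    u1 = layers.Conv2DTranspose(32, 2, strides=2, padding="same")(c2)
    u1 = layers.Concatenate()([u1, c1])
    c3 = conv_block(u1, 32)
    return layers.Conv2D(1, 1, activation="sigmoid", name=name)(c3)

inputs = layers.Input(shape=(256, 256, 3))       # bird's-eye-view image (assumed size)
line_mask = small_unet(inputs, name="lines")     # parking-slot line mask
point_mask = small_unet(inputs, name="points")   # corner center-point mask

model = Model(inputs, [line_mask, point_mask])
model.compile(
    optimizer="adam",
    loss={"lines": "binary_crossentropy", "points": "binary_crossentropy"},
    loss_weights={"lines": 1.0, "points": 1.0},  # multi-task weighting (assumed equal)
)

In this sketch the multi-task coupling is expressed only through the shared input and the weighted sum of the two losses; the thesis instead connects two identical semantic segmentation models and trains them together, so the wiring above should be read as an approximation of the idea, not the proposed architecture.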

Abstract (Chinese) I
Abstract (English) II
Acknowledgements III
LIST OF CONTENTS IV
LIST OF FIGURES VI
LIST OF TABLES VIII
CHAPTER 1 INTRODUCTION 9
1.1 Motivation 9
1.2 Contributions 11
1.3 Thesis Organization 12
CHAPTER 2 RELATED WORKS 13
2.1 Vision-based Parking Slot Detection based on Image Processing 13
2.2 Vision-based Parking Slot Detection based on Machine Learning 14
2.3 Vision-based Parking Slot Detection based on Deep Learning 15
CHAPTER 3 U-NET [11] INTRODUCTION 16
3.1 Convolution Layer 17
3.2 Rectified Linear Units (ReLU) 19
3.3 Dropout [36] and Batch Normalization [15] 20
3.4 Max Pooling 21
3.5 Transpose Convolution 22
3.6 Concatenation 23
3.7 Loss Function 23
3.8 Optimization 24
CHAPTER 4 PROPOSED METHOD 25
4.1 Deep Neural Networks 25
4.1.1 Multi-task Learning 25
4.1.2 Ground Truth Setting 34
4.1.3 Data Augmentation 38
4.2 Post Processing 40
4.2.1 Connected Component 41
4.2.2 Finding P1 and P2 Based on the Entrance Lines 42
4.2.3 Finding P3 and P4 Based on the Separating Lines 45
4.2.4 Filter out Invalid Parking Slots 48
4.2.5 Extending Separating Lines and Finding Coordinate Sequences of Parking Slots 49
CHAPTER 5 EXPERIMENTAL RESULTS 51
5.1 Experimental Environment 51
5.2 Parking Slot Database 52
5.3 Performance Evaluation 55
5.4 Comparison with Existing Methods and Analyses 55
CHAPTER 6 CONCLUSIONS AND FUTURE WORKS 64
6.1 Conclusions 64
6.2 Future Works 65
REFERENCES 66
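Section 4.2.1 in the outline above refers to a connected-component step in the post-processing. The snippet below is a minimal sketch of what such a step typically looks like, assuming OpenCV; the mask contents and sizes are made-up placeholders, not data from the thesis.

import cv2
import numpy as np

# Hypothetical binary corner-point mask (in practice, the thresholded
# output of the point-segmentation branch); values and size are made up.
mask = np.zeros((8, 8), dtype=np.uint8)
mask[1:3, 1:3] = 1   # one blob
mask[5:7, 4:7] = 1   # another blob

# Label connected components; each component's centroid is taken as a
# candidate corner center point.
num_labels, labels, stats, centroids = cv2.connectedComponentsWithStats(mask)
candidate_points = centroids[1:]  # row 0 is the background component
print(num_labels - 1, "candidate points:")
print(candidate_points)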

[1] Newelectronics website: http://www.newelectronics.co.uk/electronics-technology/an-introduction-to-ultrasonic-sensors-for-vehicle-parking/24966/, [Online].
[2] L. Zhang, J. Huang, X. Li and L. Xiong, “Vision-Based Parking-slot detection: A DCNN-Based Approach and a Large-Scale Benchmark Dataset,” IEEE Transactions on Image Processing (ITIP), vol. 27, no. 11, pages 5350-5364, Nov. 2018.
[3] L. Li, L. Zhang, X. Li, X. Liu, Y. Shen, and L. Xiong, “Vision-based parking-slot detection: A benchmark and a learning-based approach,” in Proceedings of the IEEE International Conference on Multimedia and Expo (ICME), pages 649–654, Jul. 2017.
[4] J. K. Suhr and H. G. Jung, “Full-automatic recognition of various parking slot markings using a hierarchical tree structure,” in Optical Engineering, 52(3), 037203, Mar. 2013.
[5] K. Hamada, Z. Hu, M. Fan, and H. Chen, “Surround view based parking lot detection and tracking,” in IEEE Intelligent Vehicles Symposium (IV), pages 1106-1111, Jun. 28 – Jul. 1, 2015.
[6] J. Redmon and A. Farhadi, “Yolo9000: Better, faster, stronger,” in IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pages 6517–6525, 2017.
[7] H. G. Jung, D. S. Kim, P. J. Yoon, and J. Kim, “Parking slot markings recognition for automatic parking assist system,” in Intelligent Vehicles Symposium (IV), pages 106–113, 2006.
[8] J. C. Balloch, V. Agrawal, I. Essa, and S. Chernova, “Unbiasing semantic segmentation for robot perception using synthetic data feature transfer,” in arXiv preprint arXiv:1809.03676, 2018.
[9] C. Wang, H. Zhang, M. Yang, X. Wang, L. Ye, and C. Guo, “Automatic parking based on a bird’s eye view vision system,” in Mechanical Engineering, Volume 2014, Article ID 847406, 2014.
[10] T. Shen, G. Lin, L. Liu, C. Shen, and I. Reid, “Weakly supervised semantic segmentation based on web image co-segmentation,” in arXiv preprint arXiv:1705.09052, 2017.
[11] O. Ronneberger, P. Fischer, and T. Brox, “U-net: Convolutional networks for biomedical image segmentation,” in International Conference on Medical Image Computing and Computer-Assisted Intervention (MICCAI), pages 234–241, 2015.
[12] J. Long, E. Shelhamer, and T. Darrell, “Fully convolutional networks for semantic segmentation,” in IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pages 3431-3440, 2015.
[13] V. Nair and G. E. Hinton, “Rectified linear units improve restricted boltzmann machines,” in International Conference on Machine Learning (ICML), 2010.
[14] Math Bench website: https://mathbench.umd.edu/modules/cell-processes_diffusion/page05.htm, [Online].
[15] S. Ioffe and C. Szegedy, “Batch normalization: Accelerating deep network training by reducing internal covariate shift,” in International Conference on Machine Learning (ICML), 2015.
[16] F. Milletari, N. Navab, and S.-A. Ahmadi, “V-net: Fully convolutional neural networks for volumetric medical image segmentation,” in 2016 Fourth International Conference on 3D Vision (3DV), pages 565-571, 2016.
[17] C. Harris and M. Stephens, “A combined corner and edge detector,” in 4th Alvey Vision Conference (AVC), pages 147–151, 1988.
[18] R. O. Duda and P. E. Hart, “Use of the Hough transform to detect lines and curves in pictures,” in Communications of the ACM (CACM), pages 11-15, 1972.
[19] N. Srivastava, G. Hinton, A. Krizhevsky, I. Sutskever, and R. Salakhutdinov, “Dropout: A simple way to prevent neural networks from overfitting,” in Journal of Machine Learning Research (JMLR), pages 1929-1958, 2014.
[20] Hung-yi Lee machine learning course website (NTU): http://speech.ee.ntu.edu.tw/~tlkagk/index.html, [Online].
[21] S. Ren, K. He, R. Girshick, and J. Sun, “Faster r-cnn: Towards real-time object detection with region proposal networks,” in Advances in neural information processing systems (NIPS), pages 91–99, 2015.
[22] Motor Vehicle Driver Information Service (MVDIS) law searching system website: https://www.mvdis.gov.tw/webMvdisLaw/LawContent.aspx?LawID=G0058000, [Online].
[23] cheguanyi website (the rules for cars in China): http://www.cheguanyi.com/cheguan/2331.html, [Online].
[24] P. R. Evans, “Rotations and rotation matrices,” in Acta Crystallographica Section D: Biological Crystallography (IUCr), 57 (pt 10), pages 1355-1359, 2001.
[25] R. A. Hummel, B. Kimia, and S.W. Zucker, “Deblurring Gaussian Blur,” New York University, Courant Institute of Mathematical Sciences, Computer Science Division, 1986. [book]
[26] J.A. Storer, “An Introduction to Data Structures and Algorithms,” Springer Science & Business Media, 2012. [book]
[27] COCO evaluation metrics: http://cocodataset.org/#detection-eval, [Online].
[28] M. J. D. Smith, M. F. Goodchild, and P. Longley, “Geospatial Analysis: A Comprehensive Guide to Principles, Techniques and Software Tools,” Troubador Publishing Ltd, 2007. [book]
[29] Python website: https://www.python.org/, [Online].
[30] Keras website: https://keras.io/, [Online].
[31] P. Viola and M. Jones, “Rapid Object Detection Using a Boosted Cascade of Simple Features,” in Proceedings of the 2001 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, pages 511-518, 2001.
[32] D. P. Kingma and J. L. Ba, “Adam: A Method for Stochastic Optimization,” in International Conference on Learning Representations (ICLR), 2015.
[33] I. Bilbao and J. Bilbao, “Overfitting problem and the over-training in the era of data: Particularly for Artificial Neural Networks,” in International Conference on the Internet, Cyber Security and Information Systems (ICICIS), pages 173-177, 2017.
[34] S. Helgason, “The Radon Transform,” Springer Science & Business Media, 1999. [book]
[35] C. I. Gonzalez, P. Melin, J. R. Castro, and O. Castillo, “Edge Detection Methods Based on Generalized Type-2 Fuzzy Logic,” Springer, 2017. [book]
[36] G. B. Thomas, Jr., M. D. Weir, J. R. Hass, “Thomas' Calculus: Early Transcendentals in SI Units,” Pearson Education Limited, 2016. [book]
[37] S. Haykin, and S. S. Haykin, “Neural Networks and Learning Machines,” Pearson Education, 2011. [book]
[38] Y. Huang, S. Chen, Y. Chen, Z. Jian, and N. Zheng, “Spatial-temporal based lane detection using deep learning,” in International Conference on Artificial Intelligence Applications and Innovations (IFIP), 2018.
