
Graduate Student: 許華哲 (Hua-Che Hsu)
Thesis Title: 運用多尺度錐體點網路進行三維物件偵測之研究 (Study of Applying Multi-Scale Frustum PointNets to 3D Object Detection)
Advisors: 呂政修 (Jenq-Shiou Leu), 陳維美 (Wei-Mei Chen)
Committee Members: 林淵翔 (Yuan-Hsiang Lin), 林昌鴻 (Chang Hong Lin), 呂政修 (Jenq-Shiou Leu), 陳維美 (Wei-Mei Chen)
Degree: Master
Department: College of Electrical Engineering and Computer Science - Department of Electronic and Computer Engineering
Publication Year: 2020
Graduation Academic Year: 108 (2019-2020)
Language: Chinese
Number of Pages: 41
Keywords: Multi-Scale Mechanism, Frustum PointNet, 3D Object Detection, Point Clouds, PointNet, Self-Driving Car

    Abstract: Self-driving cars will be one of the most important technologies in the future. Until autonomous driving reaches full automatic control, assisted control by a human driver remains necessary. Besides human error, a major cause of self-driving car accidents is failure of the perception system that recognizes the surrounding environment. Self-driving cars need the same three-dimensional vision as humans, and a variety of three-dimensional object detection methods already exist. In this thesis, we propose a method that improves an existing approach that uses point clouds for three-dimensional object detection. Our method builds on Frustum PointNets for 3D object detection from RGB-D data and uses point clouds as the basis for prediction. Every three-dimensional point in the point cloud is scaled about the origin (0, 0, 0) by factors of 0.125, 1 (unchanged), and 8. Each of the three scaled point clouds is processed by its own three-layer convolutional neural network to extract features, and the three feature sets are then concatenated into one. Each scale is paired with a matching convolution kernel size: the smaller-scale point cloud uses smaller kernels and the larger-scale point cloud uses larger kernels. In the smaller-scale branch, the kernel size increases layer by layer; in the larger-scale branch, it decreases layer by layer; and in the unchanged branch, the same kernel size is used in every layer. Like a human observing an object at different magnifications, this method extracts useful and complementary features from point clouds at different scales. Using the evaluation protocol provided by KITTI, we compare the model before and after the improvement; the results show that our model achieves better detection precision than the original model.
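
    The abstract specifies the multi-scale mechanism except for concrete layer widths and kernel sizes. The following PyTorch sketch is one plausible reading of it, not the thesis's actual implementation: the class name MultiScaleFeatureExtractor, the channel widths (64, 128, 256), and the kernel schedules 1, 3, 5 / 3, 3, 3 / 7, 5, 3 are illustrative assumptions. The thesis states only that the small-scale branch uses smaller kernels that grow layer by layer, the large-scale branch larger kernels that shrink layer by layer, and the unchanged branch a fixed kernel size.

    import torch
    import torch.nn as nn

    class MultiScaleFeatureExtractor(nn.Module):
        # A minimal sketch of the multi-scale mechanism described in the
        # abstract. Each frustum point cloud (B, 3, N) is scaled about the
        # origin by 0.125x, 1x, and 8x; each scaled copy passes through its
        # own three-layer 1D CNN; the three global features are concatenated.
        # Channel widths and kernel schedules are assumptions, not from the
        # thesis.
        def __init__(self, channels=(64, 128, 256)):
            super().__init__()
            self.scales = (0.125, 1.0, 8.0)
            kernel_schedules = (
                (1, 3, 5),  # small-scale branch: small kernels, growing per layer
                (3, 3, 3),  # unchanged branch: the same kernel size in every layer
                (7, 5, 3),  # large-scale branch: large kernels, shrinking per layer
            )
            self.branches = nn.ModuleList()
            for kernels in kernel_schedules:
                layers, in_ch = [], 3
                for out_ch, k in zip(channels, kernels):
                    layers += [nn.Conv1d(in_ch, out_ch, k, padding=k // 2),
                               nn.BatchNorm1d(out_ch),
                               nn.ReLU()]
                    in_ch = out_ch
                self.branches.append(nn.Sequential(*layers))

        def forward(self, points):  # points: (B, 3, N) frustum point cloud
            feats = []
            for scale, branch in zip(self.scales, self.branches):
                x = branch(points * scale)         # scale every point about (0, 0, 0)
                feats.append(x.max(dim=2).values)  # global max-pool over the N points
            return torch.cat(feats, dim=1)         # (B, 3 * channels[-1])

    # Hypothetical usage: a batch of 4 frustum point clouds, 1024 points each.
    extractor = MultiScaleFeatureExtractor()
    features = extractor(torch.randn(4, 3, 1024))  # -> shape (4, 768)

    One design caveat worth noting: points in a cloud are unordered, so kernels wider than one point are only meaningful under whatever fixed ordering the input uses; plain PointNet [2] sticks to size-1 convolutions, and the per-layer kernel schedule is precisely what this multi-scale variant changes.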

    Table of Contents:
    Chinese Abstract
    Abstract
    Acknowledgements
    Table of Contents
    List of Figures and Tables
    Chapter 1 Introduction
      1.1 Research Background and Motivation
      1.2 Research Objectives
      1.3 Chapter Overview
    Chapter 2 Related Techniques for 3D Object Detection
      2.1 Neural Networks
        2.1.1 Neurons
        2.1.2 Single-Layer Neural Networks
        2.1.3 Multi-Layer Neural Networks
        2.1.4 Convolutional Neural Networks
      2.2 YOLO 2D Object Detection
      2.3 R-CNN 2D Object Detection
      2.4 2D Multi-Scale Mechanisms
      2.5 GS3D 3D Object Detection
      2.6 PointNet
        2.6.1 PointNet Architecture
      2.7 Frustum PointNets
        2.7.1 Frustum Construction
        2.7.2 3D Instance Segmentation
        2.7.3 3D Bounding Box Regression
    Chapter 3 Multi-Scale Frustum PointNets
      3.1 Design Steps
      3.2 System Architecture
      3.3 Point Cloud Scaling Stage
      3.4 Point Cloud Feature Extraction Stage
      3.5 Scale-Dependent Kernel Size Stage
      3.6 Layer-wise Increasing and Decreasing Kernel Size Stage
      3.7 Point Cloud Feature Concatenation Stage
    Chapter 4 Experiments and Results
      4.1 Hardware
      4.2 Software Tools
      4.3 The KITTI Dataset
        4.3.1 Data Collection Platform
        4.3.2 Object Detection
        4.3.3 Data Validation
      4.4 Result Analysis
        4.4.1 Multi-Scale Results
        4.4.2 Kernel Size Results
        4.4.3 Layer-wise Increasing and Decreasing Kernel Results
        4.4.4 Method Comparison
        4.4.5 Result Presentation
    Chapter 5 Conclusion
    References

    [1] SAE International, "Automated Driving Levels of Driving Automation," New SAE International Standard J3016, 2014.
    [2] R. Q. Charles, H. Su, M. Kaichun and L. J. Guibas, "PointNet: Deep Learning on Point Sets for 3D Classification and Segmentation," 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2017, pp. 77-85.
    [3] C. R. Qi, W. Liu, C. Wu, H. Su and L. J. Guibas, "Frustum PointNets for 3D Object Detection from RGB-D Data," 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2018, pp. 918-927.
    [4] L. Chen, Y. Yang, J. Wang, W. Xu and A. L. Yuille, "Attention to Scale: Scale-Aware Semantic Image Segmentation," 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2016, pp. 3640-3649.
    [5] Y. Bai, L. Dong, X. Huang, W. Yang and M. Liao, "Hierarchical segmentation of polarimetric SAR image via Non-Parametric Graph Entropy," 2014 IEEE Geoscience and Remote Sensing Symposium, 2014, pp. 2786-2789.
    [6] A. Geiger, P. Lenz, C. Stiller and R. Urtasun, "Vision Meets Robotics: The KITTI Dataset," The International Journal of Robotics Research, vol. 32, no. 11, 2013, pp. 1231-1237.
    [7] M. Everingham, L. Van Gool, C. K. I. Williams, J. Winn and A. Zisserman, "The PASCAL Visual Object Classes Challenge 2007 (VOC2007) Results," http://www.pascal-network.org/challenges/VOC/voc2007/workshop/index.html, 2007.
    [8] A. Simonelli, S. R. Bulò, L. Porzi, M. Lopez-Antequera and P. Kontschieder, "Disentangling Monocular 3D Object Detection," 2019 IEEE/CVF International Conference on Computer Vision (ICCV), 2019, pp. 1991-1999.
    [9] J. G. Andrews, S. Buzzi, W. Choi, S. V. Hanly, A. Lozano, A. C. K. Soong and J. C. Zhang, "What Will 5G Be?," in IEEE Journal on Selected Areas in Communications, vol. 32, no. 6, pp. 1065-1082, June 2014, doi: 10.1109/JSAC.2014.2328098.
    [10] M. Daily, S. Medasani, R. Behringer and M. Trivedi, "Self-Driving Cars," in Computer, vol. 50, no. 12, pp. 18-23, December 2017, doi: 10.1109/MC.2017.4451204.
    [11] L. K. Hansen and P. Salamon, "Neural network ensembles," in IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 12, no. 10, pp. 993-1001, Oct. 1990, doi: 10.1109/34.58871.
    [12] J. Redmon, S. Divvala, R. Girshick and A. Farhadi, "You Only Look Once: Unified, Real-Time Object Detection," 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, 2016, pp. 779-788, doi: 10.1109/CVPR.2016.91.
    [13] R. Girshick, J. Donahue, T. Darrell and J. Malik, "Rich Feature Hierarchies for Accurate Object Detection and Semantic Segmentation," 2014 IEEE Conference on Computer Vision and Pattern Recognition, Columbus, OH, 2014, pp. 580-587, doi: 10.1109/CVPR.2014.81.
    [14] P. F. Felzenszwalb and D. P. Huttenlocher, "Efficient Graph-Based Image Segmentation," International Journal of Computer Vision, vol. 59, no. 2, 2004, pp. 167-181.
    [15] B. Li, W. Ouyang, L. Sheng, X. Zeng and X. Wang, "GS3D: An Efficient 3D Object Detection Framework for Autonomous Driving," 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Long Beach, CA, USA, 2019, pp. 1019-1028, doi: 10.1109/CVPR.2019.00111.
    [16] R. Girshick, "Fast R-CNN," Proceedings of the IEEE International Conference on Computer Vision (ICCV), 2015.
    [17] S. Ren, K. He, R. Girshick and J. Sun, "Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks," in IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 39, no. 6, pp. 1137-1149, 1 June 2017, doi: 10.1109/TPAMI.2016.2577031.
    [18] C. R. Qi, L. Yi, H. Su and L. J. Guibas, "PointNet++: Deep Hierarchical Feature Learning on Point Sets in a Metric Space," Advances in Neural Information Processing Systems, 2017.
