
Author: Yung-Chih Chiu (邱勇智)
Thesis Title: Creation and Validation of Taiwan’s Roadside Unit 3D Dataset in Taiwan (台灣路側單元 3D 資料集建置及驗證)
Advisor: Chung-Hsien Kuo (郭重顯)
Committee Members: Han-Pang Huang (黃漢邦), Chyi-Yeu Lin (林其禹), Ching-Hu Lu (陸敬互)
Degree: Master
Department: Department of Electrical Engineering, College of Electrical Engineering and Computer Science
Publication Year: 2021
Graduation Academic Year: 109 (AY 2020-2021)
Language: English
Number of Pages: 77
Keywords: 3D object detection, LiDAR, Deep neural network, Point cloud dataset

The quality of a dataset greatly affects the accuracy of the deep learning models trained on it. If a dataset is incomplete or differs substantially from the deployment environment, it often causes detection errors. Most existing traffic object datasets were collected abroad, where the traffic environment differs considerably from Taiwan's: automobiles are the dominant traffic objects abroad, whereas motorcycles dominate in Taiwan. Models trained on these datasets and tested in Taiwan therefore suffer detection errors from the environment gap, yet unfortunately very few traffic object datasets cover Taiwan. This study therefore proposes a roadside unit 3D dataset for Taiwan, collected with cameras and LiDAR at six major traffic locations in Taipei, Taiwan, and containing both point cloud and image data. The annotated objects include cars, motorcycles (labeled as cyclists), pedestrians, and large vehicles, for a total of 18,270 objects.
The most time-consuming step in building a dataset is data annotation. Many labeling aids exist today, and most of them project finished point cloud labels onto the image to reduce labeling time. For users, however, images are far more intuitive than point clouds and easier to annotate, and 2D object detection is already very mature; if 2D bounding boxes annotated on images can be projected into the point cloud, labeling can be largely automated. This study therefore proposes a semi-automatic labeling tool that projects 2D annotations into 3D under a road surface constraint to speed up dataset construction. In our experiments, this semi-automatic labeling tool improves labeling efficiency by nearly 40%.
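The abstract does not give the projection math, but the core idea of lifting an image annotation onto the road surface can be sketched as follows. Assuming a pinhole camera with known intrinsic matrix K and a road plane expressed in the camera frame (the function name, numeric values, and flat-road assumption below are illustrative, not the thesis' exact formulation), the bottom-center pixel of a 2D bounding box can be back-projected to a viewing ray and intersected with the plane to obtain a 3D anchor point for the object:

```python
import numpy as np

def pixel_to_ground(u, v, K, plane_normal, plane_d):
    """Lift pixel (u, v) onto the road plane n·X + d = 0 in camera coordinates.

    Illustrative sketch only: assumes a pinhole camera with intrinsic matrix K
    and a locally flat road surface, which is not necessarily the thesis'
    exact formulation.
    """
    # Back-project the pixel to a viewing ray through the camera origin.
    ray = np.linalg.inv(K) @ np.array([u, v, 1.0])
    # Intersect X = t * ray with the plane n·X + d = 0  =>  t = -d / (n·ray).
    t = -plane_d / float(plane_normal @ ray)
    if t <= 0:
        raise ValueError("ray does not hit the road plane in front of the camera")
    return t * ray  # 3D point on the road surface, camera frame

# Hypothetical example: bottom-center pixel of a car's 2D box, camera mounted
# about 5 m above a flat road, with the image y-axis pointing downward.
K = np.array([[1000.0,    0.0, 960.0],
              [   0.0, 1000.0, 540.0],
              [   0.0,    0.0,   1.0]])
point = pixel_to_ground(960.0, 800.0, K,
                        plane_normal=np.array([0.0, -1.0, 0.0]),
                        plane_d=5.0)
print(point)  # roughly [0.0, 5.0, 19.2]
```

In practice the lifted point would still need the camera-to-LiDAR extrinsic transform and a class-dependent box size before it becomes a full 3D label, which is presumably the role of the road surface correction step listed in the table of contents.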
Finally, we train three different 3D object detection models on our dataset to establish a benchmark for its 3D object detection task and to verify the dataset's credibility. The models reach 74.76%, 85.41%, and 47.79% accuracy for the car, cyclist, and pedestrian classes, respectively. In addition, exploiting the fixed viewpoint of a roadside unit, we propose a data preprocessing method that further improves detection accuracy.
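The preprocessing method itself is only named here; because a roadside unit observes the scene from a fixed viewpoint, one plausible realization (in the spirit of the "Background Filter" section listed in the table of contents) is to accumulate a voxel occupancy map from scans of the empty scene and drop points that fall into known background voxels before running the detector. The sketch below is a hypothetical illustration under those assumptions, not the thesis' exact algorithm:

```python
import numpy as np

VOXEL_SIZE = 0.2  # meters; hypothetical resolution

def build_background_voxels(background_scans, voxel_size=VOXEL_SIZE):
    """Collect the set of voxels occupied across static background scans.

    background_scans: iterable of (N, 3) arrays of LiDAR points captured while
    the road is (mostly) empty. Hypothetical illustration only.
    """
    occupied = set()
    for points in background_scans:
        keys = np.floor(points / voxel_size).astype(np.int64)
        occupied.update(map(tuple, keys))
    return occupied

def filter_background(points, background_voxels, voxel_size=VOXEL_SIZE):
    """Keep only points whose voxel never appeared in the background map."""
    keys = np.floor(points / voxel_size).astype(np.int64)
    mask = np.fromiter((tuple(k) not in background_voxels for k in keys),
                       dtype=bool, count=len(keys))
    return points[mask]

# Usage sketch: build the map once per installation, then filter every incoming
# frame before handing it to the 3D detector.
# bg_voxels = build_background_voxels(empty_scene_scans)
# foreground = filter_background(live_scan, bg_voxels)
```

A real deployment would likely count occupancy with a threshold rather than use a plain set, so that occasional noise returns do not permanently mark a voxel as background.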

Advisor's Recommendation Letter i
Oral Defense Committee Certification ii
Abstract (in Chinese) iii
Abstract iv
List of Tables vii
List of Figures ix
Nomenclature x
Chapter 1 Introduction 1
  1.1 Motivation 1
  1.2 Purpose 2
  1.3 Literature Review 3
    1.3.1 Dataset 3
    1.3.2 Label Tools 5
    1.3.3 Object Detection 6
  1.4 Organization of the Thesis 8
Chapter 2 System Architecture and Research Methods 9
  2.1 System Introduction 9
  2.2 LiDAR 13
  2.3 Data Labeling Tool Interface 14
  2.4 3D Object Detection 16
Chapter 3 Dataset Collection and Annotation 17
  3.1 Introduction of Roadside Unit Dataset 17
  3.2 Data Collection 18
    3.2.1 Collection Location 18
    3.2.2 Data Collection Process 25
    3.2.3 Sensor Calibration 26
  3.3 Data Annotation 28
    3.3.1 Semiautomatic Labeling Tool 28
    3.3.2 2D Object Detection 29
    3.3.3 2D to 3D Projection 30
    3.3.4 Road Surface Correction 31
  3.4 Dataset Format 32
    3.4.1 Data Format 32
    3.4.2 Road Surface Label 33
Chapter 4 Object Detection Model and Evaluation 34
  4.1 Object Detection Model 34
    4.1.1 PointRCNN 34
    4.1.2 PointPillars 35
    4.1.3 PV-RCNN 36
  4.2 Evaluation 37
  4.3 Data Preprocessing for Roadside Unit 39
    4.3.1 Background Filter 39
    4.3.2 Foreground and Background Classification 40
Chapter 5 Experiments and Results 42
  5.1 Experiment 1: Semiautomatic Labeling Experiment 44
  5.2 Experiment 2: 3D Object Detection Benchmark 48
  5.3 Experiment 3: Data Preprocessing for Roadside Unit 55
  5.4 Experiment 4: Integrated Datasets Training 56
Chapter 6 Conclusions and Future Work 60
  6.1 Conclusions 60
  6.2 Future Work 61
References 62

[1] A. Geiger, P. Lenz, C. Stiller, and R. Urtasun, “Vision meets robotics: The KITTI dataset,” International Journal of Robotics Research, vol. 32, no. 11, pp. 1231-1237, Sep. 2013.
[2] H. Caesar, V. Bankiti, A. H. Lang, S. Vora, V. Erin Liong, Q. Xu, A. Krishnan, Y. Pan, G. Baldan, and O. Beijbom, “NuScenes: A multimodal dataset for autonomous driving,” IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, Washington, USA, pp. 11621-11631, 2020.
[3] P. Sun, H. Kretzschmar, X. Dotiwalla, A. Chouard, V. Patnaik, P. Tsui, J. Guo, Y. Zhou, Y. Chai, B. Caine, V. Vasudevan, W. Han, J. Ngiam, H. Zhao, A. Timofeev, S. Ettinger, M. Krivokon, A. Gao, A. Joshi, Y. Zhang, J. Shlens, Z. Chen, and D. Anguelov, “Scalability in perception for autonomous driving: Waymo open dataset,” IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, Washington, USA, pp. 2446-2454, 2020.
[4] S. Shi, X. Wang, and H. Li, “PointRCNN: 3D object proposal generation and detection from point cloud,” IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Long Beach, California, USA, pp. 770-779, 2019.
[5] S. Shi, C. Guo, L. Jiang, Z. Wang, J. Shi, X. Wang, and H. Li, “PV-RCNN: Point-voxel feature set abstraction for 3D object detection,” IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, Washington, USA, pp. 10529-10538, 2020.
[6] A. H. Lang, S. Vora, H. Caesar, L. Zhou, J. Yang, and O. Beijbom, “PointPillars: Fast encoders for object detection from point clouds,” IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Long Beach, California, USA, pp. 12697-12705, 2019.
[7] A. Patil, S. Malla, H. Gang, and Y.-T. Chen, “The H3D dataset for full-surround 3D multi-object detection and tracking in crowded urban scenes,” International Conference on Robotics and Automation (ICRA), Montreal, Canada, pp. 9552-9557, 2019.
[8] M.-F. Chang, J. Lambert, P. Sangkloy, J. Singh, S. Bak, A. Hartnett, D. Wang, P. Carr, S. Lucey, D. Ramanan, and J. Hays, “Argoverse: 3D tracking and forecasting with rich maps,” IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Long Beach, California, USA, pp. 8748-8757, 2019.
[9] Y. Wang, X. Chen, Y. You, L. E. Li, B. Hariharan, M. Campbell, K. Q. Weinberger, and W. L. Chao, “Train in Germany, test in the USA: Making 3D object detectors generalize,” IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, Washington, USA, pp. 11713-11723, 2020.
[10] B. C. Russell, A. Torralba, K. P. Murphy, and W. T. Freeman, “LabelMe: A database and web-based tool for image annotation,” International Journal of Computer Vision, vol. 77, no. 1, pp. 157-173, 2008.
[11] X. Huang, P. Wang, X. Cheng, D. Zhou, Q. Geng, and R. Yang, “The ApolloScape dataset for autonomous driving,” IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops, Salt Lake City, Utah, USA, pp. 954-960, 2018.
[12] W. Zimmer, A. Rangesh, and M. Trivedi, “3D BAT: A semi-automatic, web-based 3D annotation toolbox for full-surround, multi-modal data streams,” IEEE Intelligent Vehicles Symposium (IV), Paris, France, pp. 1816-1821, 2019.
[13] J. Redmon and A. Farhadi, “YOLOv3: An incremental improvement,” arXiv:1804.02767, 2018.
[14] W. Liu, D. Anguelov, D. Erhan, C. Szegedy, S. Reed, C.-Y. Fu, and A. C. Berg, “SSD: Single shot multibox detector,” European Conference on Computer Vision (ECCV), Amsterdam, Netherlands, pp. 21-37, 2016.
[15] K. He, G. Gkioxari, P. Dollár, and R. Girshick, “Mask R-CNN,” IEEE International Conference on Computer Vision (ICCV), Venice, Italy, pp. 2961-2969, 2017.
[16] D. Bolya, C. Zhou, F. Xiao, and Y. J. Lee, “YOLACT: Real-time instance segmentation,” IEEE/CVF International Conference on Computer Vision (ICCV), Seoul, South Korea, pp. 9157-9166, 2019.
[17] C. R. Qi, H. Su, K. Mo, and L. J. Guibas, “PointNet: Deep learning on point sets for 3D classification and segmentation,” IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, Hawaii, USA, pp. 652-660, 2017.
[18] J. Zarzar, S. Giancola, and B. Ghanem, “PointRGCN: Graph convolution networks for 3D vehicles detection refinement,” arXiv:1911.12236, 2019.
[19] Z. Yang, Y. Sun, S. Liu, and J. Jia, “3DSSD: Point-based 3D single stage object detector,” IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, Washington, USA, pp. 11040-11048, 2020.
[20] Y. Zhou and O. Tuzel, “VoxelNet: End-to-end learning for point cloud based 3D object detection,” IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Salt Lake City, Utah, USA, pp. 4490-4499, 2018.
[21] Y. Yan, Y. Mao, and B. Li, “SECOND: Sparsely embedded convolutional detection,” Sensors, vol. 18, no. 10, pp. 1-17, 2018.
[22] A. Mousavian, D. Anguelov, J. Flynn, and J. Kosecka, “3D bounding box estimation using deep learning and geometry,” IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, Hawaii, USA, pp. 7074-7082, 2017.
[23] S. Vora, A. H. Lang, B. Helou, and O. Beijbom, “PointPainting: Sequential fusion for 3D object detection,” IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, Washington, USA, pp. 4604-4612, 2020.
[24] V. Lepetit, F. Moreno-Noguer, and P. Fua, “EPnP: An accurate O(n) solution to the PnP problem,” International Journal of Computer Vision, vol. 81, no. 2, pp. 155-166, 2009.
[25] T.-Y. Lin, P. Goyal, R. Girshick, K. He, and P. Dollár, “Focal loss for dense object detection,” IEEE International Conference on Computer Vision (ICCV), Venice, Italy, pp. 2980-2988, 2017.

Full-text release date: 2024/07/10 (campus network)
Full-text release date: not authorized for public release (off-campus network)
Full-text release date: not authorized for public release (National Central Library: Taiwan NDLTD system)