| Field | Value |
|---|---|
| Author | 邱勇智 Yung-Chih Chiu |
| Title | 台灣路側單元 3D 資料集建置及驗證 (Creation and Validation of Taiwan's Roadside Unit 3D Dataset in Taiwan) |
| Advisor | 郭重顯 Chung-Hsien Kuo |
| Committee | 黃漢邦 Han-Pang Huang, 林其禹 Chyi-Yeu Lin, 陸敬互 Ching-Hu Lu |
| Degree | Master |
| Department | Department of Electrical Engineering, College of Electrical Engineering and Computer Science |
| Publication year | 2021 |
| Academic year | 109 (ROC calendar) |
| Language | English |
| Pages | 77 |
| Keywords (Chinese) | 3D物件辨識, LiDAR, 深度神經網路, 點雲資料集 |
| Keywords (English) | 3D object detection, LiDAR, deep neural network, point cloud dataset |
The quality of a dataset strongly affects the accuracy of the deep learning models trained on it: if the dataset is incomplete or differs greatly from the deployment environment, detection errors are common. Most existing traffic-object datasets were collected abroad, in traffic environments very different from Taiwan's; for example, cars dominate foreign roads, while motorcycles dominate Taiwan's. Training on these datasets and then testing in Taiwan therefore leads to detection errors caused by the environment gap, yet traffic-object datasets collected in Taiwan are scarce. This thesis presents a Taiwanese roadside unit 3D dataset, collected with cameras and LiDAR at six major traffic points in Taipei, Taiwan. It contains both point cloud and image data, with labeled objects in four classes (cars, motorcycles, pedestrians, and large vehicles) totaling 18,270 objects.

The most time-consuming step in building a dataset is annotation. Many labeling aids exist today, and most of them reduce labeling time by projecting point cloud annotations onto images. For users, however, images are far more intuitive than point clouds, and 2D object detection is already very mature; if 2D image annotations could be projected into the point cloud, labeling could be largely automated. This study therefore proposes a semi-automatic labeling tool that projects 2D annotations into 3D under a road-surface constraint to speed up dataset construction. In our experiments, this tool improves labeling efficiency by nearly 40%.

We trained three different 3D object detection models on our dataset to establish a benchmark for its 3D object detection task and to verify the dataset's reliability. The experiments achieved 74.76%, 85.41%, and 47.79% accuracy on the car, motorcycle, and pedestrian classes, respectively. In addition, exploiting the fixed viewpoint of a roadside unit, we propose a data preprocessing method that further improves detection results.
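The abstract does not give the implementation details of the road-surface-constrained 2D-to-3D projection. As a rough sketch of the underlying geometry, assuming a pinhole camera with intrinsics `K`, world-to-camera extrinsics `R`, `t`, and a flat road at `z = 0` (all names and the flat-ground assumption are illustrative, not the thesis's exact formulation), the bottom-center pixel of a 2D box can be back-projected and intersected with the ground plane to locate the object in 3D:

```python
import numpy as np

def pixel_to_ground(u, v, K, R, t):
    """Back-project pixel (u, v) and intersect its viewing ray with the plane z = 0.

    K: 3x3 camera intrinsics; R, t: world-to-camera extrinsics
    (X_cam = R @ X_world + t). Returns the 3D hit point in world coordinates.
    """
    ray_cam = np.linalg.inv(K) @ np.array([u, v, 1.0])  # viewing ray in the camera frame
    cam_center = -R.T @ t                               # camera center in the world frame
    ray_world = R.T @ ray_cam                           # ray direction in the world frame
    s = -cam_center[2] / ray_world[2]                   # solve z(s) = 0 along the ray
    return cam_center + s * ray_world

# Example: camera mounted 5 m above the road, looking straight down.
K = np.array([[500.0, 0.0, 320.0], [0.0, 500.0, 240.0], [0.0, 0.0, 1.0]])
R = np.diag([1.0, -1.0, -1.0])          # world-to-camera rotation
t = np.array([0.0, 0.0, 5.0])
foot = pixel_to_ground(420.0, 40.0, K, R, t)  # ground point under a 2D box
```

Intersecting the box's bottom edge with the road plane fixes the object's position on the ground; together with a class-dependent default size, this can seed a 3D box that an annotator only needs to refine, which is one plausible way such a semi-automatic tool cuts labeling time.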
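A roadside unit stares at a fixed scene, so one plausible reading of the preprocessing step is background subtraction on the point cloud: voxels occupied in object-free reference frames are treated as static background and filtered out before detection. The voxel size and helper names below are illustrative assumptions, not the thesis's exact method.

```python
import numpy as np

def voxel_keys(points, voxel=0.2):
    """Map each 3D point to the integer index of its voxel."""
    return [tuple(k) for k in np.floor(points / voxel).astype(int)]

def build_background(frames, voxel=0.2):
    """Collect voxels occupied in (ideally object-free) reference frames."""
    occupied = set()
    for pts in frames:
        occupied.update(voxel_keys(pts, voxel))
    return occupied

def remove_background(points, background, voxel=0.2):
    """Keep only points whose voxel was never seen in the background."""
    mask = np.array([k not in background for k in voxel_keys(points, voxel)])
    return points[mask]

# Example: road-surface points form the background; a newly arrived object survives.
bg = build_background([np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0]])])
frame = np.array([[0.05, 0.0, 0.0], [5.0, 5.0, 1.0]])
foreground = remove_background(frame, bg)
```

Removing static returns this way shrinks the input a detector must process and suppresses false positives on fixed structures, which is consistent with the reported accuracy gain from exploiting the fixed viewpoint.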