
Graduate Student: 黃浩瑋 (Hao-Wei Hwang)
Thesis Title: Improved 3D Small Object Detection Based on LiDAR Voxel Features (基於光達體素特徵的改良式三維小物件偵測)
Advisor: 陳永耀 (Yung-Yao Chen)
Committee Members: 陳維美 (Wei-Mei Chen), 阮聖彰 (Shanq-Jang Ruan), 李佩君 (Pei-Jun Lee), 林淵翔 (Yuan-Hsiang Lin)
Degree: Master
Department: College of Electrical Engineering and Computer Science - Department of Electronic and Computer Engineering
Publication Year: 2023
Graduation Academic Year: 111 (2022-2023)
Language: Chinese
Pages: 55
Chinese Keywords: 光達點雲、三維物件偵測、小物件偵測
English Keywords: LiDAR point cloud, 3D object detection, small object detection
    In recent years, the development of autonomous vehicles has accelerated, and perception is one of its most important components, so LiDAR-based 3D object detection has become a popular research topic. Depending on how LiDAR point-cloud features are presented at the input, related work falls into two families: point-based methods and voxel-based methods. Each has its own strengths, weaknesses, and the object classes it detects best. To reach better detection performance, point-voxel methods have been proposed: they employ both architectures at once to bring in two different feature representations, letting the representations compensate for each other's weaknesses, and they perform well across object classes. However, point-voxel methods still have a weakness that needs optimization: because they compute multiple feature representations, their computational and memory costs rise markedly. This thesis therefore proposes a novel voxel-based model. By introducing foreground voxel probabilities and a two-stream feature transformation module guided by precise 3D structural information into the 3D backbone network, and by modifying the feature-pooling selection of the 3D region-of-interest module, the model avoids the extra computation that raw-point-cloud assistance would require, reducing its computational and memory costs while still producing high-performance outputs for every object class. Finally, the proposed model is developed and validated on the KITTI dataset, and the experimental results demonstrate its strong performance on every object class in KITTI.


    In recent years, autonomous vehicle development has gained momentum, with perception technologies playing a crucial role. This has led to extensive research on 3D LiDAR object detection techniques using point cloud data. These studies can be categorized into point-based and voxel-based methods, each with its own advantages, disadvantages, and object classes it detects best. To achieve superior performance, researchers have proposed a point-voxel fusion approach that combines the strengths of both methods, leveraging their complementary features. This approach has shown excellent performance across various object categories. However, it significantly increases computational and memory costs due to multiple feature computations. To overcome this challenge, we present a novel voxel-based model. By incorporating a foreground voxel probability and a two-stream feature transformation module guided by precise 3D structural information into the 3D backbone network, and by modifying the selection method of the 3D region-of-interest pooling module, our model eliminates the need for additional computations on the original point cloud data. This results in reduced computational and memory costs while achieving high-performance outputs across all object categories. We develop and validate the model using the KITTI dataset, demonstrating its strong performance across different object categories within this dataset.
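The key idea described in the abstract is to let a predicted per-voxel foreground probability drive which voxel features enter 3D RoI pooling, so no extra pass over the raw point cloud is needed. The thesis text here contains no code, so the following NumPy sketch is purely illustrative: the function name `select_foreground_voxels` and the top-k selection strategy are my own assumptions, not the author's actual implementation.

```python
import numpy as np

def select_foreground_voxels(voxel_feats, fg_prob, k):
    """Keep the k voxels most likely to belong to the foreground.

    voxel_feats: (N, C) array of voxel feature vectors
    fg_prob:     (N,)  predicted foreground probability per voxel
    k:           number of voxels forwarded to RoI grid pooling
    """
    k = min(k, len(fg_prob))
    order = np.argsort(-fg_prob)[:k]   # indices of the k highest probabilities
    return voxel_feats[order], fg_prob[order]

# Toy example: 5 voxels with 3-dimensional features
feats = np.arange(15, dtype=float).reshape(5, 3)
prob = np.array([0.1, 0.9, 0.3, 0.8, 0.2])
sel_feats, sel_prob = select_foreground_voxels(feats, prob, k=2)
# Voxels 1 and 3 survive; background-dominated voxels are discarded early.
```

In a full detector this filtering would sit between the sparse 3D backbone and the RoI grid pooling stage, which is where the abstract says the selection method was modified.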

    Advisor Recommendation Letter
    Committee Approval Letter
    Acknowledgements
    Abstract (Chinese)
    Abstract
    Table of Contents
    List of Figures
    List of Tables
    Chapter 1 Introduction
      1.1 Preface
      1.2 Research Motivation
      1.3 Contributions
    Chapter 2 Related Work
      2.1 3D Object Detectors
        2.1.1 Point-Based Methods
        2.1.2 Voxel-Based Methods
        2.1.3 Point-Voxel Methods
      2.2 3D Region-of-Interest Pooling
    Chapter 3 Method
      3.1 Model Architecture
      3.2 Foreground Voxel Probability Prediction Branch
      3.3 3D Voxel Backbone Network
        3.3.1 Feature Transformation Module Based on Precise 3D Structural Information
        3.3.2 Structure-Semantic Two-Stream Backbone Network
      3.4 Foreground-Probability-Based RoI Grid Pooling
      3.5 Loss Functions
        3.5.1 RPN Loss
        3.5.2 RCNN Loss
        3.5.3 Overall Loss
    Chapter 4 Experimental Results and Analysis
      4.1 Experimental Environment
      4.2 Dataset Description
      4.3 Model Parameter Settings
      4.4 Performance Evaluation Metrics
      4.5 Experimental Results
        4.5.1 Presentation of Results
        4.5.2 Analysis of Results
      4.6 Ablation Studies
    Chapter 5 Conclusion and Future Work
    References
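The experimental chapters develop and evaluate the model on the KITTI dataset mentioned in the abstract. For readers unfamiliar with KITTI, its ground-truth annotations are plain-text label files with one object per line: class, truncation, occlusion, observation angle, 2D box, 3D dimensions, 3D location, and yaw. A minimal parser might look like the sketch below; the helper name and dictionary layout are my own choices, not part of the thesis.

```python
def parse_kitti_label_line(line):
    """Parse one line of a KITTI object label file.

    Format (15 whitespace-separated fields):
    type truncated occluded alpha  x1 y1 x2 y2  h w l  x y z  rotation_y
    """
    f = line.split()
    return {
        "type": f[0],                              # e.g. 'Car', 'Pedestrian', 'Cyclist'
        "truncated": float(f[1]),                  # 0 (fully visible) .. 1 (truncated)
        "occluded": int(f[2]),                     # 0..3 occlusion level
        "alpha": float(f[3]),                      # observation angle in [-pi, pi]
        "bbox_2d": [float(v) for v in f[4:8]],     # x1, y1, x2, y2 in image pixels
        "dimensions": [float(v) for v in f[8:11]], # height, width, length in metres
        "location": [float(v) for v in f[11:14]],  # x, y, z in camera coordinates (m)
        "rotation_y": float(f[14]),                # yaw around the camera Y axis
    }

sample = "Car 0.00 0 -1.57 614.2 181.8 727.3 284.1 1.57 1.73 4.15 1.0 1.8 13.5 -1.62"
obj = parse_kitti_label_line(sample)
```

KITTI's three difficulty levels (Easy, Moderate, Hard), which the evaluation metrics in Section 4.4 are typically reported against, are derived from the `bbox_2d` height, `truncated`, and `occluded` fields above.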


    Full-text release date: 2025/08/07 (campus network)
    Full-text release date: 2028/08/07 (off-campus network)
    Full-text release date: 2028/08/07 (National Central Library: Taiwan NDLTD system)