
Graduate Student: Man-Lin Wu (吳曼琳)
Thesis Title: Cross-Modality Object Detection Based on mmWave Radar and Camera (基於毫米波雷達與相機的跨模態物件偵測)
Advisor: Yung-Yao Chen (陳永耀)
Committee Members: Chang-Hong Lin (林昌鴻), Chung-An Shen (沈中安), Jenq-Shiou Leu (呂政修), Cheng-Ming Huang (黃正民)
Degree: Master
Department: College of Electrical Engineering and Computer Science, Department of Electronic and Computer Engineering
Publication Year: 2023
Graduation Academic Year: 111 (ROC calendar)
Language: Chinese
Number of Pages: 61
Keywords (Chinese): 毫米波雷達 (mmWave radar), 跨模態物件偵測 (cross-modality object detection), 光照感知網路 (illumination-aware network)
Keywords (English): mmWave radar, cross-modality object detection, illumination-aware network

Abstract (translated from the Chinese): To achieve a highly accurate and stable autonomous driving system, multi-sensor fusion has become an indispensable approach. This study proposes a cross-modality object detection method based on mmWave radar and a camera to address the problem that camera images are easily degraded by adverse conditions such as illumination or weather, as well as the limitation that mmWave radar lacks texture features. To overcome these problems, the study exploits the complementarity between the two sensors: camera texture and radar depth information are combined to generate a radar depth map, and a dual-backbone network is designed to extract feature information from the camera image and the radar depth map separately. In addition, an illumination-aware network is introduced to estimate an illumination weight from the camera image and adjust the contribution of each modality according to that weight, achieving more effective fusion. Training and validation were conducted on the large-scale NuScenes dataset; the method achieves an average accuracy of 86.7%, runs at 75 FPS, and performs stably across different scenes. The proposed method combines the strengths of mmWave radar and camera, fully exploiting their complementarity to provide more reliable and accurate object detection for autonomous driving systems.
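
The radar depth map step mentioned above (radar height extension and depth map generation, Sections 3.1.1-3.1.2 of the thesis) can be pictured with the minimal NumPy sketch below. The function name radar_depth_map, the pillar_height_m value, and the depth normalization are illustrative assumptions, not the thesis's exact procedure, which is only given in the full text.

import numpy as np

def radar_depth_map(points_cam, K, image_size, pillar_height_m=1.5, max_depth_m=100.0):
    """Minimal sketch: rasterize radar returns into a sparse depth map.

    points_cam : (N, 3) radar points already transformed into the camera frame
                 (x right, y down, z forward).
    K          : (3, 3) camera intrinsic matrix.
    image_size : (H, W) of the camera image.
    Each return is extended vertically into a 'pillar' (hypothetical
    pillar_height_m) to compensate for the radar's missing height information,
    and the pixel value stores the normalized depth.
    """
    H, W = image_size
    depth_map = np.zeros((H, W), dtype=np.float32)

    for x, y, z in points_cam:
        if z <= 0:  # behind the camera
            continue
        # Project the return and a point pillar_height_m above it.
        u = int(K[0, 0] * x / z + K[0, 2])
        v_bottom = int(K[1, 1] * y / z + K[1, 2])
        v_top = int(K[1, 1] * (y - pillar_height_m) / z + K[1, 2])
        if not (0 <= u < W):
            continue
        v_lo = max(min(v_top, v_bottom), 0)
        v_hi = min(max(v_top, v_bottom), H - 1)
        d = min(z / max_depth_m, 1.0)  # normalize depth to [0, 1]
        # Keep the nearest return where pillars overlap.
        col = depth_map[v_lo:v_hi + 1, u]
        col[(col == 0) | (col > d)] = d
    return depth_map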


Abstract (English): To achieve a highly accurate and robust autonomous driving system, multi-sensor fusion has become an essential approach. This study proposes a cross-modality object detection method based on mmWave radar and camera to address the limitations of cameras being affected by lighting or weather conditions, as well as the lack of texture features in mmWave radar. To overcome these challenges, this study leverages the complementarity between the two sensors by combining camera texture and radar depth information to generate a radar depth map. Dual backbones are designed to extract feature information from camera images and radar depth maps separately. Additionally, an illumination-aware network is introduced to evaluate the lighting weight in camera images and adjust the contributions of the different modalities based on that weight, enabling more effective fusion. Training and evaluation were conducted on the NuScenes dataset, achieving an average accuracy of 86.7% at 75 FPS, with stable performance across different scenarios. The proposed method combines the strengths of mmWave radar and camera, fully exploiting their complementarity to provide more reliable and accurate object detection for autonomous driving systems.
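
To make the illumination-aware weighting concrete, the following PyTorch sketch blends camera and radar feature maps with a scalar weight predicted from the camera image. The class IlluminationAwareFusion, its layer sizes, and the convex blending rule are assumptions for illustration; they do not reproduce the thesis's exact sub-network.

import torch
import torch.nn as nn

class IlluminationAwareFusion(nn.Module):
    """Minimal sketch of illumination-weighted mid-level fusion.

    A small CNN head predicts an illumination score w in [0, 1] from the
    camera image; camera and radar feature maps from the two backbones are
    then blended as w * F_cam + (1 - w) * F_radar. Layer sizes and the
    blending rule are illustrative assumptions.
    """
    def __init__(self, channels):
        super().__init__()
        self.illum_net = nn.Sequential(
            nn.Conv2d(3, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(32, 1), nn.Sigmoid(),           # illumination weight w
        )
        self.fuse = nn.Conv2d(channels, channels, 1)  # 1x1 conv after blending

    def forward(self, image, feat_cam, feat_radar):
        w = self.illum_net(image).view(-1, 1, 1, 1)   # per-sample weight
        blended = w * feat_cam + (1.0 - w) * feat_radar
        return self.fuse(blended)

# Usage sketch: blend 256-channel backbone features for a 2-image batch.
fusion = IlluminationAwareFusion(channels=256)
image = torch.rand(2, 3, 224, 224)
feat_cam, feat_radar = torch.rand(2, 256, 28, 28), torch.rand(2, 256, 28, 28)
fused = fusion(image, feat_cam, feat_radar)           # (2, 256, 28, 28)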

Table of Contents:
Advisor's Recommendation Letter I
Committee Approval Certificate I
Acknowledgements I
Abstract (Chinese) II
Abstract III
Table of Contents IV
List of Figures VI
List of Tables VIII
Chapter 1 Introduction 1
1.1 Preface 1
1.2 Research Motivation 3
1.2.1 Representation of Modality Input Channels 3
1.2.2 Multi-Modal Fusion Methods 5
1.3 Contributions 6
Chapter 2 Related Work 7
2.1 Image-Based Object Detection 7
2.2 Object Detection Based on Radar-Image Fusion 12
2.2.1 Early Fusion 12
2.2.2 Late Fusion 13
2.2.3 Middle Fusion 14
Chapter 3 Method 17
3.1 Radar Data Preprocessing 17
3.1.1 Radar Height Extension 19
3.1.2 Radar Depth Map Generation 20
3.2 Model Architecture 23
3.3 Sub-Network Architecture 25
3.3.1 Dual-Backbone Network 26
3.3.2 Intermediate Layers 27
3.3.3 Illumination-Aware Sub-Network 30
Chapter 4 Experimental Results and Analysis 31
4.1 Experimental Environment 31
4.2 Dataset 32
4.3 Model Parameter Settings 34
4.4 Performance Evaluation 35
4.5 Ablation Studies 37
4.5.1 Comparison of Fusion Methods 37
4.5.2 Radar Input Channel Representations 39
4.6 Experimental Results 41
Chapter 5 Conclusion and Future Work 46
References 47


Full-Text Availability Date: 2025/08/07 (campus network)
Full-Text Availability Date: 2028/08/07 (off-campus network)
Full-Text Availability Date: 2028/08/07 (National Central Library: Taiwan NDLTD system)