
Author: Zi-Hao Shen (沈子皓)
Thesis Title: Enhancing Radar Object Detection through Swin Transformer-based Neural Networks and CNN Integration (通過基於 Swin Transformer 的神經網路和 CNN 集成增強雷達物件偵測)
Advisor: Shan-Hsiang Shen (沈上翔)
Committee Members: Kai-Lung Hua (花凱龍), Shan-Hsiang Shen (沈上翔), Yung-Yao Chen (陳永耀)
Degree: Master
Department: Department of Computer Science and Information Engineering, College of Electrical Engineering and Computer Science
Publication Year: 2023
Graduation Academic Year: 112 (ROC calendar)
Language: English
Number of Pages: 40
Keywords (Chinese): 雷達物體檢測、深度學習、卷積、注意力機制、基於 Swin Transformer 的模型、雷達後處理
Keywords (English): Radar Object Detection, Deep Learning, Convolution, Attention Mechanism, Swin Transformer-Based Model, Radar Post-Processing

  • Contents
    Recommendation Letter  i
    Approval Letter  ii
    Abstract in Chinese  iii
    Abstract in English  iv
    Acknowledgements  v
    Contents  vi
    List of Figures  viii
    List of Tables  xi
    1 Introduction  1
    2 Related Work  3
      2.1 3-D CNN-Based Radar Object Detection  3
      2.2 Transformer-Based Approaches  4
    3 Methodology  6
      3.1 Model Architecture  6
      3.2 M-Net Plus: Enhanced Radar Signal Processing in CRUW Dataset  7
      3.3 Learnable Triplet Attention: Enhancing Feature Extraction Capabilities  11
      3.4 L-NMS Plus  14
    4 Experiment  17
      4.1 Dataset  17
      4.2 Implementation Details  21
      4.3 Ablation Studies  23
        4.3.1 M-Net Plus Module  24
        4.3.2 Learnable Triplet Attention Module  26
        4.3.3 M-Net Plus Module + Learnable Triplet Attention Module  28
        4.3.4 L-NMS Plus Module  30
        4.3.5 M-Net Plus Module + Learnable Triplet Attention Module + L-NMS Plus Module  35
    5 Conclusion  37
    References  39


    Full-Text Release Date: 2029/01/30 (campus network)
    Full-Text Release Date: 2034/01/30 (off-campus network)
    Full-Text Release Date: 2034/01/30 (National Central Library: Taiwan NDLTD system)