Graduate student: | 沈子皓 Zi-Hao Shen |
---|---|
Thesis title: | 通過基於 Swin Transformer 的神經網路和 CNN 集成增強雷達物件偵測 Enhancing Radar Object Detection through Swin Transformer-based Neural Networks and CNN Integration |
Advisor: | 沈上翔 Shan-Hsiang Shen |
Oral defense committee: | 花凱龍 Kai-Lung Hua, 沈上翔 Shan-Hsiang Shen, 陳永耀 Yung-Yao Chen |
Degree: | Master |
Department: | College of Electrical Engineering and Computer Science - Department of Computer Science and Information Engineering |
Year of publication: | 2023 |
Academic year of graduation: | 112 (ROC calendar) |
Language: | English |
Number of pages: | 40 |
Keywords (Chinese): | 雷達物體檢測, 深度學習, 卷積, 注意力機制, 基於 Swin 變壓器的模型, 雷達後處理 |
Keywords (English): | Radar Object Detection, Deep Learning, Convolution, Attention Mechanism, Swin Transformer-Based Model, Radar Post-Processing |
[1] T. Jiang, L. Zhuang, Q. An, J. Wang, K. Xiao, and A. Wang, “T-rodnet: Transformer for vehicular millimeter-wave radar object detection,” IEEE Transactions on Instrumentation and Measurement, vol. 72, pp. 1–12, 2022.
[2] D. D. Thang, S. C. Hidayati, Y.-Y. Chen, W.-H. Cheng, S.-W. Sun, and K.-L. Hua, “A spatial-pyramid scene categorization algorithm based on locality-aware sparse coding,” in 2016 IEEE Second International Conference on Multimedia Big Data (BigMM), pp. 342–345, 2016.
[3] C.-L. Yang, H. Tampubolon, A. Setyoko, K.-L. Hua, M. Tanveer, and W. Wei, “Secure and privacy-preserving human interaction recognition of pervasive healthcare monitoring,” IEEE Transactions on Network Science and Engineering, vol. 10, no. 5, pp. 2439–2454, 2023.
[4] D. S. Tan, J. H. Soeseno, and K.-L. Hua, “Controllable and identity-aware facial attribute transformation,” IEEE Transactions on Cybernetics, vol. 52, no. 6, pp. 4825–4836, 2022.
[5] F. J. Abdu, Y. Zhang, M. Fu, Y. Li, and Z. Deng, “Application of deep learning on millimeter-wave radar signals: A review,” Sensors, vol. 21, no. 6, p. 1951, 2021.
[6] T. Yamaguchi and T. Mizutani, “Localization of subsurface pipes in radar images by 3d convolutional neural network and kirchhoff migration,” in 2021 IEEE International Geoscience and Remote Sensing Symposium IGARSS, pp. 4841–4844, IEEE, 2021.
[7] X. Ding, X. Zhang, J. Han, and G. Ding, “Scaling up your kernels to 31x31: Revisiting large kernel design in cnns,” in Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pp. 11963–11975, 2022.
[8] A. Vaswani, N. Shazeer, N. Parmar, J. Uszkoreit, L. Jones, A. N. Gomez, Ł. Kaiser, and I. Polosukhin, “Attention is all you need,” Advances in neural information processing systems, vol. 30, 2017.
[9] S. Hochreiter and J. Schmidhuber, “Long Short-Term Memory,” Neural Computation, vol. 9, no. 8, pp. 1735–1780, 1997.
[10] W. Zaremba, I. Sutskever, and O. Vinyals, “Recurrent neural network regularization,” arXiv preprint arXiv:1409.2329, 2014.
[11] K. Cho, B. Van Merriënboer, C. Gulcehre, D. Bahdanau, F. Bougares, H. Schwenk, and Y. Bengio, “Learning phrase representations using rnn encoder-decoder for statistical machine translation,” arXiv preprint arXiv:1406.1078, 2014.
[12] A. Dosovitskiy, L. Beyer, A. Kolesnikov, D. Weissenborn, X. Zhai, T. Unterthiner, M. Dehghani, M. Minderer, G. Heigold, S. Gelly, et al., “An image is worth 16x16 words: Transformers for image recognition at scale,” arXiv preprint arXiv:2010.11929, 2020.
[13] M. Shahid and K.-L. Hua, “Fire detection using transformer network,” in Proceedings of the 2021 International Conference on Multimedia Retrieval, pp. 627–630, 2021.
[14] Z. Liu, Y. Lin, Y. Cao, H. Hu, Y. Wei, Z. Zhang, S. Lin, and B. Guo, “Swin transformer: Hierarchical vision transformer using shifted windows,” in Proceedings of the IEEE/CVF international conference on computer vision, pp. 10012–10022, 2021.
[15] M. Shahid, J. J. Virtusio, Y.-H. Wu, Y.-Y. Chen, M. Tanveer, K. Muhammad, and K.-L. Hua, “Spatio-temporal self-attention network for fire detection and segmentation in video surveillance,” IEEE Access, vol. 10, pp. 1259–1275, 2022.
[16] M. Shahid, I.-F. Chien, W. Sarapugdi, L. Miao, and K.-L. Hua, “Deep spatial-temporal networks for flame detection,” Multimedia Tools and Applications, vol. 80, pp. 35297–35318, 2021.
[17] M. Zhao, T. Li, M. A. Alsheikh, Y. Tian, H. Zhao, A. Torralba, and D. Katabi, “Through-wall human pose estimation using radio signals,” in 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 7356–7365, 2018.
[18] D. Misra, T. Nalamada, A. U. Arasanipalai, and Q. Hou, “Rotate to attend: Convolutional triplet attention module,” in 2021 IEEE Winter Conference on Applications of Computer Vision (WACV), pp. 3138–3147, 2021.
[19] Y. Wang, Z. Jiang, Y. Li, J.-N. Hwang, G. Xing, and H. Liu, “Rodnet: A real-time radar object detection network cross-supervised by camera-radar fused object 3d localization,” IEEE Journal of Selected Topics in Signal Processing, vol. 15, no. 4, pp. 954–967, 2021.
[20] M. A. Richards et al., Fundamentals of radar signal processing, vol. 1. McGraw-Hill New York, 2005.
[21] Y. Wang, Y.-T. Huang, and J.-N. Hwang, “Monocular visual object 3d localization in road scenes,” in Proceedings of the 27th ACM International Conference on Multimedia, pp. 917–925, 2019.