
Graduate Student: Shao-Tsz Liao (廖卲慈)
Thesis Title: An Anchor-Free Joint Model for Multiple Object Tracking in UAV Videos (一種應用於無人機影像之無錨框多物件追蹤聯合訓練模型)
Advisors: Shanq-Jang Ruan (阮聖彰), Chang Hong Lin (林昌鴻)
Oral Defense Committee: Shanq-Jang Ruan (阮聖彰), Chang Hong Lin (林昌鴻), 陳維美, 呂政修
Degree: Master
Department: Department of Electronic and Computer Engineering, College of Electrical Engineering and Computer Science
Year of Publication: 2023
Graduation Academic Year: 112
Language: English
Pages: 72
Keywords: Multiple Object Tracking (MOT), Autonomous Transportation, Anchor-Free Joint Model


Abstract: Multiple object tracking (MOT) is a fundamental task that plays a crucial role in intelligent transportation systems. It refers to the process of simultaneously detecting and tracking multiple objects, and it offers numerous important benefits, such as ensuring safety, enabling efficient traffic management, and supporting autonomous driving. The traditional two-stage strategy solves MOT as two consecutive sub-tasks; however, combining detection and tracking enables the model to exploit contextual information and improves consistency. This thesis presents a novel MOT approach that integrates object detection and tracking into a joint model while eliminating predefined anchor boxes to reduce computation and parameter count. HRNet is adopted as the backbone and integrated with polarized self-attention to strengthen feature extraction. This design addresses the challenge of tracking objects in bidirectional motion, which frequently occurs from the perspective of the vehicle. Experiments conducted on the VisDrone and UAVDT datasets demonstrate that the proposed method is well suited to UAV videos and surpasses modern state-of-the-art trackers.
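To make the joint, anchor-free design described above concrete, the sketch below shows the kind of output branches such a tracker attaches to a shared backbone feature map: a center-point heatmap, box-size and center-offset regressors for detection, and a normalized identity-embedding head for association. This is a minimal PyTorch illustration in the spirit of center-based joint trackers such as FairMOT; the class and head names, channel sizes, and stride are illustrative assumptions, not the thesis's exact implementation (which builds on HRNet with polarized self-attention).

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def make_head(in_ch: int, out_ch: int, mid_ch: int = 256) -> nn.Sequential:
    """A small conv head shared by all output branches (illustrative sizes)."""
    return nn.Sequential(
        nn.Conv2d(in_ch, mid_ch, kernel_size=3, padding=1),
        nn.ReLU(inplace=True),
        nn.Conv2d(mid_ch, out_ch, kernel_size=1),
    )

class AnchorFreeJointHeads(nn.Module):
    """Anchor-free detection + re-ID heads over one backbone feature map.

    Instead of predefined anchor boxes, each object is represented by its
    center point on a downsampled heatmap, plus per-pixel size/offset
    regression and an identity embedding used later for data association.
    """

    def __init__(self, in_ch: int = 64, num_classes: int = 1, emb_dim: int = 128):
        super().__init__()
        self.heatmap = make_head(in_ch, num_classes)  # object-center scores
        self.size = make_head(in_ch, 2)               # box width / height
        self.offset = make_head(in_ch, 2)             # sub-pixel center offset
        self.embedding = make_head(in_ch, emb_dim)    # identity features

    def forward(self, feat: torch.Tensor) -> dict:
        return {
            "heatmap": torch.sigmoid(self.heatmap(feat)),          # in [0, 1]
            "size": self.size(feat),
            "offset": self.offset(feat),
            "embedding": F.normalize(self.embedding(feat), dim=1),  # unit norm
        }

# Example: a 512x512 frame at stride 4 yields a 128x128 feature map.
heads = AnchorFreeJointHeads(in_ch=64)
out = heads(torch.randn(1, 64, 128, 128))
print({k: tuple(v.shape) for k, v in out.items()})
```

At inference time, local maxima on the heatmap become detections, and the embedding vectors sampled at those peaks can be matched to existing tracks, for example by cosine distance combined with a motion gate, to carry identities across frames.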

Table of Contents:
RECOMMENDATION FORM
COMMITTEE FORM
摘要 (CHINESE ABSTRACT)
ABSTRACT
ACKNOWLEDGEMENTS
TABLE OF CONTENTS
LIST OF FIGURES
LIST OF TABLES
CHAPTER 1 INTRODUCTION
1.1 Motivation of This Thesis
1.2 Purpose of This Thesis
1.3 Contribution of This Thesis
1.4 Organization of This Thesis
CHAPTER 2 BACKGROUND
2.1 Multiple Object Tracking
2.2 Data Association
2.3 Convolutional Neural Networks
2.4 Attention Mechanism
CHAPTER 3 RELATED WORKS
3.1 Tracking-by-Detection
3.2 Joint Detection and Tracking
3.3 Impact of Anchors
3.4 MOT for UAV Videos
CHAPTER 4 PROPOSED ARCHITECTURE
4.1 Backbone Network
4.2 Polarized Self-attention
4.3 Output Branches
4.4 Data Association
CHAPTER 5 EXPERIMENTS
5.1 Dataset
5.2 Experiment Details
5.3 Dataset
5.4 Performance Comparison
5.5 Improvement Analysis
5.6 Qualitative Results
CHAPTER 6 CONCLUSION
REFERENCES


Full Text Release Date: 2025/12/05 (campus network)
Full Text Release Date: 2025/12/05 (off-campus network)
Full Text Release Date: 2025/12/05 (National Central Library: Taiwan NDLTD system)