
Graduate student: Yun-Jhu Zeng (曾筠筑)
Thesis title: Improving Embedded Target Tracking Systems Based On Siamese Networks With Infrared Images (利用熱紅外影像改善基於孿生網路的嵌入式目標追蹤系統)
Advisor: Shi-Jinn Horng (洪西進)
Oral defense committee: Cheng-Chi Lee (李正吉), Chang-Biau Yang (楊昌彪), Chu-Sing Yang (楊竹星), Wei-Hong Lin (林韋宏), Shi-Jinn Horng (洪西進)
Degree: Master
Department: College of Electrical Engineering and Computer Science, Department of Computer Science and Information Engineering
Year of publication: 2021
Academic year of graduation: 109 (ROC calendar; 2020-2021)
Language: Chinese
Number of pages: 50
Keywords: target tracking, thermal infrared image, Siamese tracking network, multi-modal machine learning, 3-axis servo motor, embedded system
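  • Humans have had an instinct for tracking targets since ancient times. Whether in hunting prey or in modern racing, games, and warfare, tracking a target accurately is the key to winning, and with machines to assist in tracking we can naturally solve more problems. As the field of target tracking matures, it has been widely applied to military uses such as fighter-aircraft lidar and missiles, and civilian applications have flourished as well, spanning home security, sports broadcasting, smart homes, and many other areas.
    This thesis aims to build a practical three-axis target tracking system. Imitating the hunting ability of snakes in nature, we add thermal infrared images as an additional model modality to improve the tracking ability of an existing Siamese network. Drawing on Multi-Modal Machine Learning (MMML), we modify the Siamese tracking network into a multi-modal fusion model so that it better fuses the features of ordinary images and thermal infrared images and gains a richer perception of the natural world. At the same time, the Siamese models for ordinary images and thermal infrared images are trained separately, with a pseudo-Siamese network as the main architecture that processes each modality's information on its own, which addresses the problem of insufficient data.
    Through multi-modal fusion, this thesis improves the stability and accuracy of the model in several tests. The overall model is built on the Siamese network architecture and is reasonably practical. An embedded development board and a three-axis servo motor further increase the system's flexibility in application.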


    Whether in hunting prey or in modern racing, games, and war, tracking a target accurately is the key to victory. When a machine can help us track the target, we can naturally solve more problems. Target tracking is becoming increasingly mature: it is used not only in military applications, such as warplane radar and missiles, but also for civilian purposes including home security, event broadcasting, smart homes, and many other areas.
    In this thesis, we aim to develop a practical three-axis target tracking system. Imitating the hunting ability of snakes in nature, we add thermal infrared images as an additional modality to improve the tracking ability of an existing Siamese tracking network. Drawing on the field of Multi-Modal Machine Learning (MMML), we modify the Siamese tracking network into a multi-modal fusion model that better integrates the features of normal images and thermal infrared images and so improves the model's perception of the natural world. At the same time, the Siamese network models for normal images and thermal infrared images are trained separately, and a pseudo-Siamese network is used as the main framework to process the information of each modality separately, which addresses the problem of insufficient data. Through multi-modal fusion, this thesis improves the stability and accuracy of the model in several tests. With an embedded development board and a three-axis servo motor, the system is more flexible in application.
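
    The fusion idea described in the abstract can be illustrated with a toy sketch. It is not the thesis's model: the abstract does not specify the fusion rule, so the weighted sum of per-modality response maps used here is an assumption, and all names (`cross_correlate`, `fuse_responses`, `peak`, `rgb_weight`) are hypothetical. The small 2-D grids stand in for feature maps that two separately trained branches (one for normal images, one for thermal infrared images) would produce.

```python
# Toy sketch (assumption): late fusion of two Siamese-style response maps.
# Feature maps are plain 2-D lists standing in for the outputs of two
# separately trained branches (normal image and thermal infrared image).

def cross_correlate(search, exemplar):
    """Slide the exemplar over the search map and return the response map."""
    sh, sw = len(search), len(search[0])
    eh, ew = len(exemplar), len(exemplar[0])
    response = []
    for i in range(sh - eh + 1):
        row = []
        for j in range(sw - ew + 1):
            score = sum(
                search[i + di][j + dj] * exemplar[di][dj]
                for di in range(eh)
                for dj in range(ew)
            )
            row.append(score)
        response.append(row)
    return response


def fuse_responses(rgb_response, tir_response, rgb_weight=0.5):
    """Weighted sum of the two response maps (hypothetical fusion rule)."""
    return [
        [rgb_weight * r + (1.0 - rgb_weight) * t for r, t in zip(rrow, trow)]
        for rrow, trow in zip(rgb_response, tir_response)
    ]


def peak(response):
    """Return (row, col) of the highest score in a response map."""
    best = max(
        (value, i, j)
        for i, row in enumerate(response)
        for j, value in enumerate(row)
    )
    return best[1], best[2]


exemplar = [[1, 1], [1, 1]]

# Normal image: two look-alike blobs, so the response has two equal peaks.
rgb_search = [
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]
# Thermal infrared image: only the warm target at the bottom right shows up.
tir_search = [
    [0, 0, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]

rgb_response = cross_correlate(rgb_search, exemplar)
tir_response = cross_correlate(tir_search, exemplar)
fused = fuse_responses(rgb_response, tir_response)

print(peak(fused))  # (2, 2)
```

    In this toy case the normal-image response scores both blobs equally, and the thermal response breaks the tie in favour of the warm target. A real tracker would replace the hand-written grids with learned convolutional features.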

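    Abstract (Chinese)
    Abstract
    Acknowledgements
    Table of Contents
    List of Tables
    List of Figures
    Chapter 1 Introduction
    1.1 Research Motivation and Objectives
    1.2 Related Work
    1.3 Thesis Organization and Overview
    Chapter 2 Siamese Networks Applied to Target Tracking
    2.1 Siamese Networks
    2.2 Siamese Tracking Networks
    2.3 The Arbitrary Target Tracking Problem
    Chapter 3 Experimental Methods and Procedure
    3.1 System Architecture
    3.2 Hardware Architecture
    3.3 Datasets Used in the Experiments
    3.4 Training the Fully-Convolutional Siamese Neural Network
    3.5 Multi-Modal Fusion Model
    Chapter 4 Experimental Results and Discussion
    4.1 Evaluation Methods
    4.2 Results
    4.3 Practical Application
    Chapter 5 Conclusion and Future Work
    References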


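    Full-text release date: 2023/09/18 (campus network)
    Full-text release date: 2026/09/18 (off-campus network)
    Full-text release date: 2026/09/18 (National Central Library: Taiwan thesis and dissertation system)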