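Low-Complexity Embedded Thermal Imaging Object Detection System Design | National Taiwan University of Science and Technology Theses and Dissertations System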

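Graduate student:	程韋嘉 Wei-Jia Cheng
Thesis title:	低複雜度的嵌入式熱像物件偵測系統設計 Low-Complexity Embedded Thermal Imaging Object Detection System Design
Advisor:	陳永耀 Yung-Yao Chen
Oral examination committee:	陳永耀 Yung-Yao Chen, 陳維美 Wei-Mei Chen, 林淵翔 Yuan-Hsiang Lin, 阮聖彰 Shanq-Jang Ruan, 李佩君 Pei-Jun Lee
Degree:	Master
Department:	College of Electrical Engineering and Computer Science - Department of Electronic and Computer Engineering
Year of publication:	2023
Academic year of graduation:	111
Language:	Chinese
Number of pages:	52
Keywords (Chinese):	嵌入式系統 (embedded system), 熱影像物件偵測 (thermal image object detection), 低複雜度 (low complexity)
Keywords (English):	Infrared Image, Low-Complexity, Embedded System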

In recent years, with the rapid growth of artificial intelligence, driving safety in autonomous driving has received increasing attention. Object detection is widely applied in autonomous driving: vehicle sensors such as RGB cameras, thermal cameras, and LiDAR are used to detect and identify objects on the road, and the autonomous driving system relies on these detections to make appropriate decisions and take appropriate actions.
To enhance driving safety, many systems use RGB cameras for object detection. However, RGB cameras struggle with dark nighttime scenes, overexposure caused by strong light, and raindrops on the lens, all of which make detection difficult and lead to poor performance in those scenarios.
Therefore, to develop an object detection model whose accuracy remains stable in most scenarios, this thesis builds on thermal image object detection. The model is implemented on an embedded development board to match the computing capability of automotive electronics. The thesis also proposes a neural network architecture with low computational complexity to address the loss of computational efficiency caused by overly large models.
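
As a rough illustration of how a low-complexity architecture can reduce computation, the thesis outline lists a depthwise separable network among its methods (Section 3.2.4). The sketch below is not taken from the thesis; it only compares the parameter and multiply-accumulate counts of a standard convolution with those of a depthwise separable one. The function names and the example layer shape are hypothetical, and the counts assume stride 1, "same" padding, and no bias terms.

```python
def standard_conv_cost(k, c_in, c_out, h, w):
    """Parameters and multiply-accumulates (MACs) of a k x k standard convolution.

    Assumes stride 1, 'same' padding, and no bias, so the output map is h x w.
    """
    params = k * k * c_in * c_out
    macs = params * h * w
    return params, macs


def depthwise_separable_cost(k, c_in, c_out, h, w):
    """Cost of a k x k depthwise convolution followed by a 1 x 1 pointwise one.

    Same assumptions as standard_conv_cost.
    """
    depthwise_params = k * k * c_in   # one k x k filter per input channel
    pointwise_params = c_in * c_out   # 1 x 1 convolution that mixes channels
    params = depthwise_params + pointwise_params
    macs = params * h * w
    return params, macs


if __name__ == "__main__":
    # Hypothetical layer: 3 x 3 kernel, 128 -> 256 channels, 40 x 40 feature map.
    shape = (3, 128, 256, 40, 40)
    std_params, std_macs = standard_conv_cost(*shape)
    sep_params, sep_macs = depthwise_separable_cost(*shape)
    print(f"standard : {std_params} params, {std_macs} MACs")
    print(f"separable: {sep_params} params, {sep_macs} MACs")
    print(f"ratio    : {sep_params / std_params:.4f}")
```

Under these assumptions the cost ratio equals 1/c_out + 1/k², so for a 3 x 3 kernel the separable form needs roughly one ninth of the computation once the number of output channels is large.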

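Advisor's Recommendation Letter
Examination Committee Approval
Acknowledgements
Abstract (Chinese)
Abstract (English)
Table of Contents
List of Figures
List of Tables
Chapter 1 Introduction
1.1 Preface
1.2 Research Motivation
1.3 Thesis Contributions
Chapter 2 Related Work
2.1 RGB-Based Object Detection Models
2.1.1 One-stage learning
2.1.2 Two-stage learning
2.2 Thermal Image Object Detection (Including Thermal Image Fault Detection)
2.3 YOLO (You Only Look Once) Family of Object Detection Methods
2.4 Introduction to Lightweight Model Architectures
2.4.1 ONNX Model Architecture
2.4.2 TensorFlow Lite
Chapter 3 Method
3.1 System Workflow Overview
3.2 Thermal Image Object Recognition System
3.2.1 Backbone Architecture
3.2.2 Neck SPP+PAN
3.2.3 Auto Learning Bounding Box Anchors
3.2.4 Depthwise Separable Network
3.2.5 Attention Mechanism
3.3 Thermal Image Model Training Details and Conversion Process
3.4 Methods for Reducing Model Complexity
3.4.1 Post-Training Quantization
3.4.2 Quantization-Aware Training
3.5 Loss Function
3.6 Data Augmentation Techniques
3.7 Hardware and Development Board Information
Chapter 4 Experimental Results and Analysis
4.1 Experimental Environment
4.2 Dataset
4.3 Model Parameter Settings
4.4 Performance Evaluation and Ablation Study
4.5 Experimental Results
Chapter 5 Conclusion
References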

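Full-text release date: 2025/08/17 (campus network)
Full-text release date: 2025/08/17 (off-campus network)
Full-text release date: 2025/08/17 (National Central Library: National Digital Library of Theses and Dissertations in Taiwan)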
