
Author: 許凱鈞 (Kai-June Xu)
Thesis Title: 基於U-Net架構和圖像融合的熱影像分割 (Thermal Image Segmentation Based on U-Net Architecture and Image Fusion)
Advisors: 孫沛立 (Pei-Li Sun); 陳怡永 (Yi-Yung Chen)
Committee Member: 林宗翰 (Tzung-Han Lin)
Degree: Master
Department: 應用科技學院 - Graduate Institute of Color and Illumination Technology
Publication Year: 2023
Graduation Academic Year: 111
Language: Chinese
Pages: 88
Keywords: image segmentation, thermal imaging, deep learning, convolutional neural networks, U-Net, image fusion
Views: 269; Downloads: 2

In recent years, thermal cameras have been widely used in fields such as smart surveillance systems, biometric identification, and health monitoring. Beyond locating heat sources and measuring their temperature, thermal images can also be used for object recognition and object segmentation of heat sources. However, thermal cameras cost far more to manufacture than ordinary cameras: high-end models offer high resolution and stable measurements but are too expensive for widespread use, while low- and mid-range models have resolutions too low for image-recognition tasks. How to improve object recognition and segmentation accuracy with low- and mid-range thermal cameras is therefore a topic worth studying.
Because low- and mid-range thermal cameras have limited resolution and thermal diffusion blurs the scene, thermal images cannot clearly render the appearance of heat-source objects, which severely degrades the accuracy of heat-source object recognition and segmentation. To address this, researchers have proposed dual-spectrum techniques that fuse infrared thermal images with visible-light images to improve segmentation accuracy. Building on this idea, this study proposes an optimized convolutional neural network with U-Net as its backbone. Infrared thermal images, together with visible-light images captured in disturbed environments, serve as the dataset, and the network fuses the information from both spectra to improve the segmentation accuracy of heat-source objects.
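As a rough illustration of the dual-spectrum fusion idea described above, the sketch below shows a minimal two-encoder, U-Net-style network in PyTorch that downsamples the thermal and visible inputs separately, fuses their feature maps by channel concatenation, and upsamples to per-pixel class logits. The `FusionUNet` name, layer counts, and channel widths are illustrative assumptions, not the thesis's exact architecture.

```python
# Minimal sketch of a dual-spectrum, U-Net-style fusion network.
# Illustrative only; not the architecture used in the thesis.
import torch
import torch.nn as nn

def conv_block(in_ch, out_ch):
    """Two 3x3 convolutions with BatchNorm and ReLU, as in typical U-Net blocks."""
    return nn.Sequential(
        nn.Conv2d(in_ch, out_ch, 3, padding=1), nn.BatchNorm2d(out_ch), nn.ReLU(inplace=True),
        nn.Conv2d(out_ch, out_ch, 3, padding=1), nn.BatchNorm2d(out_ch), nn.ReLU(inplace=True),
    )

class FusionUNet(nn.Module):
    def __init__(self, n_classes=2):
        super().__init__()
        self.pool = nn.MaxPool2d(2)
        # Separate downsampling encoders for the two spectra.
        self.enc_thermal1 = conv_block(1, 32)   # 1-channel thermal input
        self.enc_thermal2 = conv_block(32, 64)
        self.enc_visible1 = conv_block(3, 32)   # 3-channel visible input
        self.enc_visible2 = conv_block(32, 64)
        # Bottleneck operates on the concatenated (fused) features.
        self.bottleneck = conv_block(128, 128)
        # Decoder: upsample, then refine with a skip connection from shallow features.
        self.up = nn.Upsample(scale_factor=2, mode="bilinear", align_corners=False)
        self.dec1 = conv_block(192, 64)         # 128 upsampled + 32 thermal + 32 visible
        self.head = nn.Conv2d(64, n_classes, 1)

    def forward(self, thermal, visible):
        # Encode each spectrum independently.
        t1 = self.enc_thermal1(thermal)         # (B, 32, H, W)
        t2 = self.enc_thermal2(self.pool(t1))   # (B, 64, H/2, W/2)
        v1 = self.enc_visible1(visible)         # (B, 32, H, W)
        v2 = self.enc_visible2(self.pool(v1))   # (B, 64, H/2, W/2)
        # Fuse the two spectra by channel concatenation.
        fused = torch.cat([t2, v2], dim=1)      # (B, 128, H/2, W/2)
        x = self.up(self.bottleneck(fused))     # (B, 128, H, W)
        # Skip connection from the shallow features of both encoders.
        x = self.dec1(torch.cat([x, t1, v1], dim=1))
        return self.head(x)                     # per-pixel class logits
```

For example, `FusionUNet()(torch.randn(1, 1, 64, 64), torch.randn(1, 3, 64, 64))` returns a (1, 2, 64, 64) tensor of per-pixel logits.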
This study first examines how image augmentation, three loss functions, three batch sizes, four learning rates, and four imaging environments affect segmentation accuracy. Two image datasets, one captured indoors and one outdoors, were preprocessed (simulating disturbed environments, augmenting the datasets, and so on) and then used to train the U-Net-based convolutional neural network. Optimized training parameters were selected by setting and comparing different configurations during training. The experimental results show mIoU accuracies of about 73% and 75% for the indoor and outdoor datasets respectively, slightly better than the thermal-image segmentation models reported in previous literature.
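For reference, the mIoU figure quoted above is the per-class intersection-over-union averaged over all classes. A minimal NumPy sketch of the metric (illustrative only, not the thesis's evaluation code):

```python
import numpy as np

def mean_iou(pred, target, n_classes):
    """Mean IoU over classes present in either the prediction or the ground truth."""
    ious = []
    for c in range(n_classes):
        inter = np.logical_and(pred == c, target == c).sum()
        union = np.logical_or(pred == c, target == c).sum()
        if union > 0:  # skip classes absent from both label maps
            ious.append(inter / union)
    return float(np.mean(ious))

# Example: two 2x2 label maps with classes {0, 1}.
pred   = np.array([[0, 1], [1, 1]])
target = np.array([[0, 1], [0, 1]])
print(mean_iou(pred, target, n_classes=2))  # class 0: 1/2, class 1: 2/3 -> mIoU = 7/12
```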
This study then compares how infrared thermal images of different resolutions, including mid- and low-resolution thermal inputs paired with a high-resolution visible-light camera, affect heat-source segmentation accuracy. Single-spectrum networks that use only thermal images or only visible-light images are also compared. The experimental results show that the dual-spectrum image-fusion network architecture clearly outperforms the single-spectrum architectures in thermal-image segmentation accuracy.
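For the mixed-resolution experiments, the low-resolution thermal frame must first be brought onto the visible image's pixel grid before the two can be fused. One common way to do this, assumed here purely for illustration (the file names and sizes are placeholders, and the thesis may align its image pairs differently), is interpolation with OpenCV:

```python
import cv2

# Placeholder file names; real data would come from the paired thermal/visible dataset.
thermal_lowres = cv2.imread("thermal.png", cv2.IMREAD_GRAYSCALE)  # e.g. a 160x120 thermal frame
visible = cv2.imread("visible.png")                               # e.g. a 640x480 visible frame

# Upsample the thermal frame to the visible camera's resolution before fusion.
# Note: cv2.resize takes (width, height), not (height, width).
h, w = visible.shape[:2]
thermal_up = cv2.resize(thermal_lowres, (w, h), interpolation=cv2.INTER_CUBIC)
assert thermal_up.shape == (h, w)
```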



Abstract (Chinese) II
Abstract (English) III
Acknowledgements V
Table of Contents VI
List of Figures IX
List of Tables XII
Chapter 1 Introduction 1
1.1 Research Background 1
1.2 Research Objectives 2
1.3 Research Limitations 3
1.4 Thesis Organization 5
Chapter 2 Literature Review 6
2.1 Multispectral Images 6
2.1.1 Grayscale Visible-Light Images 6
2.1.2 Infrared and Thermal Imaging 8
2.2 Image Segmentation 11
2.2.2 Threshold-Based Segmentation 11
2.2.3 Edge Detection 12
2.2.4 Clustering and Superpixel Segmentation 12
2.2.5 Graph-Cut Segmentation 13
2.2.6 Morphological Watershed 13
2.3 Neural Networks for Image Segmentation 14
2.3.1 U-Net 14
2.3.2 Residual Neural Networks 15
2.4 Image Information Fusion 16
2.4.2 Neural-Network-Based Image Information Fusion 18
2.5 Image Segmentation with Image Fusion 20
2.5.1 RTFNet 20
2.5.2 FuseSeg Net 21
2.5.3 Lwbna_U-Net 22
Chapter 3 Methodology 23
3.1 Experimental Equipment 24
3.1.1 Hardware 24
3.1.2 Software 26
3.2 Datasets and Processing 27
3.2.1 Definition of Shooting Environments 27
3.2.2 Dataset Image Alignment 29
3.2.3 Dataset Image Preprocessing 30
3.2.4 Training Data Allocation 35
3.3 Neural Network Model 35
3.3.1 Network Architecture 35
3.3.3.1 Downsampling Convolutional Layers for Thermal Images 36
3.3.3.2 Downsampling Convolutional Layers for Visible-Light Images 37
3.3.3.3 Upsampling Convolutional Layers 38
3.3.2 Attention Module 39
Chapter 4 Experimental Results 40
4.1 Evaluation Metrics 40
4.1.1 Confusion Matrix 40
4.1.2 Mean Intersection over Union 41
4.1.3 F-score 42
4.2 Parameter Optimization 43
4.2.1 Loss Functions 43
4.2.2 Learning Rate 47
4.2.3 Batch Size 49
4.3 Model Training Time 51
4.4 Comparison of Training Results 52
4.5 Effect of Spectral Band Selection on Thermal Image Segmentation Accuracy 58
Chapter 5 Conclusions and Suggestions 68
5.1 Conclusions 68
5.2 Suggestions 70
References 71

