
Graduate Student: 蔡峻宏 (JUN-HONG CAI)
Thesis Title: 基於可解釋性分析用於深度學習模型之紡織線圈瑕疵檢測
Interpretability Analysis in Deep Learning Model for Textile Coil Defect Detection
Advisor: 蘇順豐 (Shun-Feng Su)
Committee Members: 蘇順豐 (Shun-Feng Su), 姚立德 (Leeh-Ter Yao), 莊鎮嘉 (Chen-Chia Chuang), 王偉彥 (Wei-Yen Wang), 王乃堅 (Nai-Jian Wang)
Degree: Master
Department: Department of Electrical Engineering, College of Electrical Engineering and Computer Science
Publication Year: 2023
Graduating Academic Year: 111 (2022-2023)
Language: Chinese
Pages: 75
Keywords (Chinese): 殘差網路、目標檢測、影像分類、注意力機制、可解釋AI、瑕疵檢測
Keywords (English): Residual Networks, Object Detection, Image Classification, Attention, Explainable AI, Defect Detection
    With the advancement of technology, product yield and quality have drawn increasing attention, and defect inspection has become an indispensable part of the production process. This study applies grid-based classification and object detection methods to detect defects of textile coils in images. The product videos used in this study were provided by 台灣化學纖維股份有限公司 (Formosa Chemicals & Fibre Corporation) and capture the surface of textile coils on an inspection production line; these videos were converted in-house into the training and testing data required by the models. The first method is grid-based classification, in which the original images are cut into 64×64 grid images for training and testing. The Residual Split-Attention Network (ResNeSt) is the base model for this method. To lower the misclassification rate and raise accuracy, this study uses Grad-CAM to analyze the model: based on the regions highlighted in the heat maps, the model architecture is modified and the Convolutional Block Attention Module (CBAM) is added, yielding a final model suited to the dataset of this study. The proposed model achieves 0.893 precision and 0.892 recall on the test dataset, which shows that the proposed architecture outperforms the base model. The second method is object detection with YOLOv4, in which the original images are cut into 1080×1080 images for training and testing. YOLOv4 achieves 0.92 precision and 0.76 recall on the test dataset. The proposed model and YOLOv4 each have their advantages and disadvantages; however, the prediction time of the proposed model is more than twice that of YOLOv4.
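As an illustration of the grid-based preprocessing described above, the sketch below slices a frame into non-overlapping 64×64 tiles so that each tile can be classified as defective or normal. Non-overlapping tiles and dropping border remainders are assumptions made here for illustration; the abstract does not specify the thesis's exact cutting convention.

```python
# Minimal sketch (assumed convention, not the thesis's exact pipeline) of
# cutting a frame into 64x64 grid tiles for per-tile defect classification.
import numpy as np

def slice_into_tiles(image: np.ndarray, tile: int = 64):
    """Cut an H x W (x C) image into non-overlapping tile x tile patches.

    Returns (row, col, patch) triples so each patch's position in the
    original frame can be recovered to localize a detected defect.
    Border remainders are dropped; padding would be an equally valid choice.
    """
    h, w = image.shape[:2]
    patches = []
    for r in range(0, h - tile + 1, tile):
        for c in range(0, w - tile + 1, tile):
            patches.append((r, c, image[r:r + tile, c:c + tile]))
    return patches

# Example: a 1080x1080 frame yields 16 x 16 = 256 full 64x64 tiles.
frame = np.zeros((1080, 1080, 3), dtype=np.uint8)
print(len(slice_into_tiles(frame)))  # 256
```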


    In this study, both grid-based classification and object detection methods are considered for image-based defect detection of textile coils. Actual product videos are converted into training and testing data. The first method adopts grid-based classification: the original image is divided into 64×64 grid images for training and testing. The Residual Split-Attention Network (ResNeSt) is the base model for grid-based classification. To reduce recognition errors and improve accuracy, this study uses Grad-CAM to analyze the learning behaviors by examining the attention areas in the heat maps, and the model architecture is modified based on the findings of those analyses. In addition to the structural change, the final model is obtained by adding the Convolutional Block Attention Module (CBAM) to the model. From our experiments, the final model is well suited to the dataset of this study: it achieves 0.893 precision and 0.892 recall on the test dataset, evidently outperforming the base model. The second method is direct object (defect) detection using YOLOv4, in which the original image is cut into 1080×1080 images for training and testing. YOLOv4 achieves 0.92 precision and 0.76 recall on the test dataset. Compared with YOLOv4, the proposed grid-based model has its own advantages and disadvantages, but its prediction time is longer, at more than twice that of YOLOv4.
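Since the abstract names CBAM [8] as the key addition to the final model, the following is a minimal PyTorch sketch of that block for reference: channel attention followed by spatial attention, each multiplied onto the feature map. The reduction ratio of 16 and the 7×7 spatial kernel are the defaults from [8]; where the block is attached inside the modified ResNeSt is not stated in the abstract, so this is illustrative only, not the thesis's exact architecture.

```python
import torch
import torch.nn as nn

class ChannelAttention(nn.Module):
    """Channel attention: shared MLP over global avg- and max-pooled features."""
    def __init__(self, channels: int, reduction: int = 16):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(channels, channels // reduction), nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels))

    def forward(self, x):
        b, c, _, _ = x.shape
        avg = self.mlp(x.mean(dim=(2, 3)))   # global average pooling
        mx = self.mlp(x.amax(dim=(2, 3)))    # global max pooling
        return x * torch.sigmoid(avg + mx).view(b, c, 1, 1)

class SpatialAttention(nn.Module):
    """Spatial attention: conv over channel-wise avg and max maps."""
    def __init__(self, kernel_size: int = 7):
        super().__init__()
        self.conv = nn.Conv2d(2, 1, kernel_size, padding=kernel_size // 2)

    def forward(self, x):
        avg = x.mean(dim=1, keepdim=True)
        mx, _ = x.max(dim=1, keepdim=True)
        return x * torch.sigmoid(self.conv(torch.cat([avg, mx], dim=1)))

class CBAM(nn.Module):
    """CBAM block: channel attention followed by spatial attention."""
    def __init__(self, channels: int):
        super().__init__()
        self.ca = ChannelAttention(channels)
        self.sa = SpatialAttention()

    def forward(self, x):
        return self.sa(self.ca(x))
```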

    Chinese Abstract
    Abstract
    Acknowledgments
    Table of Contents
    List of Figures
    List of Tables
    Chapter 1 Introduction
        1.1 Background
        1.2 Motivation and Data Description
        1.3 Contributions
        1.4 Thesis Organization
    Chapter 2 Related Work
        2.1 Convolutional Neural Networks (CNN)
        2.2 Image Classification
        2.3 Object Detection
        2.4 Explainable AI
    Chapter 3 Image Preprocessing
        3.1 Frame Difference Method
        3.2 Mathematical Morphology
            3.2.1 Erosion
            3.2.2 Dilation
        3.3 Hough Ellipse Detection
        3.4 Data Augmentation
    Chapter 4 Methodology
        4.1 Structure
            4.1.1 Grid-Based Classification
            4.1.2 Object Detection
        4.2 Data Preparation
            4.2.1 Grid-Based Classification
            4.2.2 Object Detection
        4.3 Model Architecture
            4.3.1 ResNet
            4.3.2 ResNet18 + CBAM
                4.3.2.1 Channel Attention Module
                4.3.2.2 Spatial Attention Module
            4.3.3 ResNeSt
        4.4 Modifying and Redesigning the Model Architecture
            4.4.1 Feature Map Visualization
            4.4.2 Grad-CAM
            4.4.3 Custom Model
        4.5 YOLOv4
            4.5.1 CSPDarknet53
            4.5.2 SPP (Spatial Pyramid Pooling) and PAN (Path Aggregation Network)
        4.6 Loss Functions
    Chapter 5 Experiments
        5.1 Environment
            5.1.1 Training Environment
            5.1.2 Testing Environment
            5.1.3 Software Environment
            5.1.4 Camera Environment
        5.2 Datasets
            5.2.1 Grid-Based Classification
            5.2.2 Object Detection
        5.3 Evaluation Metric
        5.4 Experiments
            5.4.1 Grid-Based Classification
                5.4.1.1 Implementation Details
                5.4.1.2 Results and Analysis
            5.4.2 Object Detection
                5.4.2.1 Implementation Details
                5.4.2.2 Results and Analysis
        5.5 Real-World Test Results and Analysis
            5.5.1 Grid-Based Classification
            5.5.2 Object Detection
    Chapter 6 Conclusions and Future Work
        6.1 Conclusions
        6.2 Future Work
            6.2.1 Grid-Based Classification
            6.2.2 Object Detection
    References

    [1] J. Luo, Z. Yang, S. Li, and Y. Wu, "FPCB Surface Defect Detection: A Decoupled Two-Stage Object Detection Framework," IEEE Transactions on Instrumentation and Measurement, vol. 70, pp. 1-11, 2021, doi: 10.1109/TIM.2021.3092510.
    [2] S. Guan et al., "Ceramic ring defect detection based on improved YOLOv5," in 2022 3rd International Conference on Computer Vision, Image and Deep Learning & International Conference on Computer Engineering and Applications (CVIDL & ICCEA), 2022, pp. 115-118, doi: 10.1109/CVIDLICCEA56201.2022.9824099.
    [3] Z. Li, J. Zhang, T. Zhuang, and Q. Wang, "Metal surface defect detection based on MATLAB," in 2018 IEEE 3rd Advanced Information Technology, Electronic and Automation Control Conference (IAEAC), 2018, pp. 2365-2371, doi: 10.1109/IAEAC.2018.8577540.
    [4] D. Wang and H. Liu, "Edge detection of cord fabric defects image based on an improved morphological erosion detection methods," in 2010 Sixth International Conference on Natural Computation, 2010, vol. 8, pp. 3943-3947, doi: 10.1109/ICNC.2010.5584778.
    [5] Y. Cheng, D. HongGui, and F. YuXin, "Effects of Faster Region-based Convolutional Neural Network on the Detection Efficiency of Rail Defects under Machine Vision," in 2020 IEEE 5th Information Technology and Mechatronics Engineering Conference (ITOEC), 2020, pp. 1377-1380, doi: 10.1109/ITOEC49072.2020.9141787.
    [6] H. Zhang et al., "ResNeSt: Split-Attention Networks," arXiv e-prints, p. arXiv:2004.08955, 2020.
    [7] R. R. Selvaraju, M. Cogswell, A. Das, R. Vedantam, D. Parikh, and D. Batra, "Grad-CAM: Visual Explanations from Deep Networks via Gradient-based Localization," arXiv e-prints, p. arXiv:1610.02391, 2016.
    [8] S. Woo, J. Park, J.-Y. Lee, and I. S. Kweon, "CBAM: Convolutional Block Attention Module," arXiv e-prints, p. arXiv:1807.06521, 2018.
    [9] 劉昱暐, "基於深度學習之紡織線圈瑕疵檢測探討" (A study on deep-learning-based textile coil defect detection), Master's thesis, Department of Mechanical Engineering, National Taiwan University of Science and Technology, Taipei, 2020.
    [10] K. Fukushima, "Neocognitron: A self-organizing neural network model for a mechanism of pattern recognition unaffected by shift in position," Biological Cybernetics, vol. 36, no. 4, pp. 193-202, 1980, doi: 10.1007/BF00344251.
    [11] K. He, X. Zhang, S. Ren, and J. Sun, "Deep Residual Learning for Image Recognition," arXiv e-prints, p. arXiv:1512.03385, 2015.
    [12] K. Simonyan and A. Zisserman, "Very Deep Convolutional Networks for Large-Scale Image Recognition," arXiv e-prints, p. arXiv:1409.1556, 2014.
    [13] C. Szegedy et al., "Going deeper with convolutions," in 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2015, pp. 1-9, doi: 10.1109/CVPR.2015.7298594.
    [14] S. Xie, R. Girshick, P. Dollár, Z. Tu, and K. He, "Aggregated Residual Transformations for Deep Neural Networks," arXiv e-prints, p. arXiv:1611.05431, 2016.
    [15] J. Redmon, S. Divvala, R. Girshick, and A. Farhadi, "You Only Look Once: Unified, Real-Time Object Detection," arXiv e-prints, p. arXiv:1506.02640, 2015.
    [16] C.-Y. Wang, A. Bochkovskiy, and H.-Y. M. Liao, "YOLOv7: Trainable bag-of-freebies sets new state-of-the-art for real-time object detectors," arXiv e-prints, p. arXiv:2207.02696, 2022.
    [17] W. Liu et al., "SSD: Single Shot MultiBox Detector," arXiv e-prints, p. arXiv:1512.02325, 2015.
    [18] T.-Y. Lin, P. Goyal, R. Girshick, K. He, and P. Dollár, "Focal Loss for Dense Object Detection," arXiv e-prints, p. arXiv:1708.02002, 2017.
    [19] K. He, X. Zhang, S. Ren, and J. Sun, "Spatial Pyramid Pooling in Deep Convolutional Networks for Visual Recognition," arXiv e-prints, p. arXiv:1406.4729, 2014.
    [20] S. Ren, K. He, R. Girshick, and J. Sun, "Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks," arXiv e-prints, p. arXiv:1506.01497, 2015.
    [21] R. L. Draelos and L. Carin, "Use HiResCAM instead of Grad-CAM for faithful explanations of convolutional neural networks," arXiv e-prints, p. arXiv:2011.08891, 2020.
    [22] A. Chattopadhyay, A. Sarkar, P. Howlader, and V. N. Balasubramanian, "Grad-CAM++: Improved Visual Explanations for Deep Convolutional Networks," arXiv e-prints, p. arXiv:1710.11063, 2017.
    [23] R. Fu, Q. Hu, X. Dong, Y. Guo, Y. Gao, and B. Li, "Axiom-based Grad-CAM: Towards Accurate Visualization and Explanation of CNNs," arXiv e-prints, p. arXiv:2008.02312, 2020.
    [24] H. Wang et al., "Score-CAM: Score-Weighted Visual Explanations for Convolutional Neural Networks," arXiv e-prints, p. arXiv:1910.01279, 2019.
    [25] S. Desai and H. G. Ramaswamy, "Ablation-CAM: Visual Explanations for Deep Convolutional Network via Gradient-free Localization," in 2020 IEEE Winter Conference on Applications of Computer Vision (WACV), 2020, pp. 972-980, doi: 10.1109/WACV45572.2020.9093360.
    [26] M. Bany Muhammad and M. Yeasin, "Eigen-CAM: Class Activation Map using Principal Components," arXiv e-prints, p. arXiv:2008.00299, 2020.
    [27] Y. Zhang, X. Wang, and B. Qu, "Three-Frame Difference Algorithm Research Based on Mathematical Morphology," Procedia Engineering, vol. 29, pp. 2705-2709, 2012, doi: 10.1016/j.proeng.2012.01.376.
    [28] N. Singla, "Motion Detection Based on Frame Difference Method," 2014.
    [29] G. Matheron and J. Serra, "The birth of mathematical morphology," in Proc. 6th International Symposium on Mathematical Morphology, Sydney, Australia, 2002, pp. 1-16.
    [30] R. A. McLaughlin, "Randomized Hough Transform: Improved ellipse detection with comparison," Pattern Recognition Letters, vol. 19, no. 3, pp. 299-305, 1998, doi: 10.1016/S0167-8655(98)00010-5.
    [31] F. J. Moreno-Barea, F. Strazzera, J. M. Jerez, D. Urda, and L. Franco, "Forward Noise Adjustment Scheme for Data Augmentation," in 2018 IEEE Symposium Series on Computational Intelligence (SSCI), 2018, pp. 728-734, doi: 10.1109/SSCI.2018.8628917.
    [32] S. Yang, W. Xiao, M. Zhang, S. Guo, J. Zhao, and F. Shen, "Image Data Augmentation for Deep Learning: A Survey," arXiv e-prints, p. arXiv:2204.08610, 2022.
    [33] X. Wang and Z. Hu, "Grid-based pavement crack analysis using deep learning," in 2017 4th International Conference on Transportation Information and Safety (ICTIS), 2017, pp. 917-924, doi: 10.1109/ICTIS.2017.8047878.
    [34] A. Bochkovskiy, C.-Y. Wang, and H.-Y. M. Liao, "YOLOv4: Optimal Speed and Accuracy of Object Detection," arXiv e-prints, p. arXiv:2004.10934, 2020.
    [35] C. Y. Wang, H. Y. M. Liao, Y. H. Wu, P. Y. Chen, J. W. Hsieh, and I. H. Yeh, "CSPNet: A New Backbone that can Enhance Learning Capability of CNN," in 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), 2020, pp. 1571-1580, doi: 10.1109/CVPRW50498.2020.00203.
    [36] K. He, X. Zhang, S. Ren, and J. Sun, "Spatial Pyramid Pooling in Deep Convolutional Networks for Visual Recognition," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 37, no. 9, pp. 1904-1916, 2015, doi: 10.1109/TPAMI.2015.2389824.
    [37] S. Liu, L. Qi, H. Qin, J. Shi, and J. Jia, "Path Aggregation Network for Instance Segmentation," in 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2018, pp. 8759-8768, doi: 10.1109/CVPR.2018.00913.
    [38] J. Redmon and A. Farhadi, "YOLOv3: An Incremental Improvement," arXiv e-prints, p. arXiv:1804.02767, 2018.
    [39] T. He, Z. Zhang, H. Zhang, Z. Zhang, J. Xie, and M. Li, "Bag of Tricks for Image Classification with Convolutional Neural Networks," in 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2019, pp. 558-567, doi: 10.1109/CVPR.2019.00065.
    [40] J. Hu, L. Shen, S. Albanie, G. Sun, and E. Wu, "Squeeze-and-Excitation Networks," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 42, no. 8, pp. 2011-2023, 2020, doi: 10.1109/TPAMI.2019.2913372.
    [41] X. Li, W. Wang, X. Hu, and J. Yang, "Selective Kernel Networks," in 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2019, pp. 510-519, doi: 10.1109/CVPR.2019.00060.
    [42] M. D. Zeiler and R. Fergus, "Visualizing and Understanding Convolutional Networks," arXiv e-prints, p. arXiv:1311.2901, 2013.
    [43] G. Huang, Z. Liu, L. van der Maaten, and K. Q. Weinberger, "Densely Connected Convolutional Networks," arXiv e-prints, p. arXiv:1608.06993, 2016.
    [44] J. Redmon and A. Farhadi, "YOLO9000: Better, Faster, Stronger," arXiv e-prints, p. arXiv:1612.08242, 2016.
    [45] A. Krizhevsky, I. Sutskever, and G. Hinton, "ImageNet Classification with Deep Convolutional Neural Networks," in Advances in Neural Information Processing Systems, vol. 25, 2012, doi: 10.1145/3065386.
    [46] T.-Y. Lin, P. Dollár, R. Girshick, K. He, B. Hariharan, and S. Belongie, "Feature Pyramid Networks for Object Detection," arXiv e-prints, p. arXiv:1612.03144, 2016.
    [47] S. Ruder, "An overview of gradient descent optimization algorithms," arXiv e-prints, p. arXiv:1609.04747, 2016.
