
Student: Yu-Hsuan Huang (黃鈺軒)
Thesis Title: Study on Feature Enhanced Deep Learning Networks for Spark based Steel Classification (強化AI重要特徵於鋼材火花類別分類技術開發)
Advisor: Shun-Feng Su (蘇順豐)
Committee Members: Yo-Ping Huang (黃有評), Tsu-Tian Lee (李祖添), Nai-Jian Wang (王乃堅), Wei-Yen Wang (王偉彥)
Degree: Master
Department: College of Electrical Engineering and Computer Science, Department of Electrical Engineering
Year of Publication: 2023
Academic Year of Graduation: 112
Language: English
Number of Pages: 63
Keywords: Image regression, Spark recognition, Spark classification, Grad-CAM, Image processing, Image recognition

In this thesis, we propose a model combining ResNet-18 with Grad-CAM: the Grad-CAM output of the model's layer 1 is multiplied with the last CNN layer of ResNet layer 4 to enhance the spark features. To further improve efficiency and accuracy, the layer-1 Grad-CAM output is replaced with a background-free spark mask obtained by passing the input image through a pre-processing pipeline consisting of YOLOv4, gamma correction, HSV, and dilation, which enhances the spark features more precisely. In addition, this study uses IoU to evaluate how well the model's attention falls on the sparks. The classification accuracy of the model reaches 98.4% and the IoU score reaches 37%, improvements of 2.9% and 10.6%, respectively, over the original ResNet-18. The high classification accuracy, together with Grad-CAM as an explainable-AI tool for analyzing the basis of the model's decisions, makes the model more trustworthy. This study also proposes using image regression to estimate the carbon content of the steel directly; although classification based on the regressed values reaches an accuracy of only 75.4%, the R-squared score of 0.973 shows that the model is feasible for this type of regression task.
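As a rough illustration only, a background-free spark mask of the kind described above could be built with OpenCV along the following lines, assuming the YOLOv4 detector has already cropped the spark region; the gamma value and the HSV bounds below are placeholder assumptions, not the settings used in this thesis.

```python
import cv2
import numpy as np

def spark_mask(crop_bgr, gamma=0.5):
    """Minimal sketch: binary spark mask from a YOLOv4-cropped BGR image.

    The gamma value and HSV bounds are illustrative guesses only.
    """
    # Gamma correction: brighten faint spark streaks via a lookup table.
    lut = np.array([(i / 255.0) ** gamma * 255 for i in range(256)],
                   dtype=np.uint8)
    corrected = cv2.LUT(crop_bgr, lut)

    # HSV thresholding: keep bright, warm-coloured pixels (the sparks).
    hsv = cv2.cvtColor(corrected, cv2.COLOR_BGR2HSV)
    lower = np.array([0, 40, 180])    # assumed lower (H, S, V) bound
    upper = np.array([40, 255, 255])  # assumed upper (H, S, V) bound
    mask = cv2.inRange(hsv, lower, upper)

    # Dilation: thicken thin streaks so the mask covers them completely.
    kernel = np.ones((3, 3), np.uint8)
    return cv2.dilate(mask, kernel, iterations=2)  # 255 = spark, 0 = background
```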


In this thesis, we propose a model that combines ResNet-18 with Grad-CAM. The model enhances spark features by multiplying the Grad-CAM output of layer 1 with the last CNN layer of ResNet layer 4. To further improve efficiency and accuracy, we replace the layer-1 Grad-CAM output with a background-free spark mask obtained through a series of pre-processing steps consisting of YOLOv4, gamma correction, HSV, and dilation; this replacement enhances the spark features more precisely. The classification accuracy of the model reaches 98.4% and the IoU score reaches 37%, increases of 2.9% and 10.6%, respectively, over the original ResNet-18. The high classification accuracy, coupled with Grad-CAM for interpreting the model's decision-making rationale, enhances the model's credibility. Furthermore, this research introduces image regression to estimate the carbon content of steel directly. While the classification accuracy based on the regressed values is only 75.4%, the R-squared score of 0.973 indicates the feasibility of the model for this type of regression task.
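To make the feature-enhancement step concrete, the following is a minimal PyTorch sketch of multiplying ResNet-18 layer-4 features by a down-sampled spark mask before classification; the class name MaskEnhancedResNet18, the fusion point, and the nearest-neighbour resizing are illustrative assumptions rather than the exact implementation of this thesis.

```python
import torch.nn as nn
import torch.nn.functional as F
from torchvision.models import resnet18

class MaskEnhancedResNet18(nn.Module):
    """Sketch of the ResNet-18 x mask idea: layer-4 features are re-weighted
    by a binary spark mask so that background activations are suppressed."""

    def __init__(self, num_classes):
        super().__init__()
        backbone = resnet18(weights=None)
        self.stem = nn.Sequential(backbone.conv1, backbone.bn1,
                                  backbone.relu, backbone.maxpool)
        self.stages = nn.Sequential(backbone.layer1, backbone.layer2,
                                    backbone.layer3, backbone.layer4)
        self.pool = nn.AdaptiveAvgPool2d(1)
        self.fc = nn.Linear(512, num_classes)  # layer 4 of ResNet-18 outputs 512 channels

    def forward(self, x, mask):
        # x: (N, 3, H, W) spark image; mask: (N, 1, H, W) with values in [0, 1]
        feat = self.stages(self.stem(x))
        # Resize the mask to layer-4 resolution and multiply it into the features.
        m = F.interpolate(mask, size=feat.shape[-2:], mode="nearest")
        feat = feat * m
        return self.fc(self.pool(feat).flatten(1))
```

In the Grad-CAM variant described in the abstract, `mask` would presumably be a normalized Grad-CAM heat map computed from layer 1 instead of the pre-processing mask.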

Chinese Abstract I
Abstract II
Acknowledgements III
Table of Contents IV
List of Figures VIII
List of Tables X
Chapter 1 Introduction 1
  1.1 Background 1
  1.2 Motivations 2
  1.3 Contributions 3
  1.4 Thesis Organization 4
Chapter 2 Related Work 5
Chapter 3 Methodology 7
  3.1 System Overview 7
  3.2 Pre-processing 8
    3.2.1 YOLO v4 8
    3.2.2 Gamma Correction 9
    3.2.3 HSV 11
    3.2.4 Dilation 12
  3.3 Network Architecture 13
    3.3.1 ResNet 13
    3.3.2 VGG-16 14
    3.3.3 ConvLSTM 15
    3.3.4 ResNet-152 Double Output 16
    3.3.5 ResNet-18 Only Layer1 17
    3.3.6 ResNet-18 Layer1×Layer4 18
    3.3.7 ResNet-18×Mask 19
    3.3.8 ResNet-18 Regression 21
  3.4 Grad-CAM 22
  3.5 Loss Function 24
    3.5.1 Cross-Entropy 24
    3.5.2 Mean Absolute Error 25
Chapter 4 Experiments 26
  4.1 Datasets 26
    4.1.1 Classification Dataset 26
    4.1.2 Regression Dataset 28
  4.2 Evaluation Metrics 29
    4.2.1 Confusion Matrix 29
    4.2.2 IoU 30
    4.2.3 R-Squared 31
  4.3 Implementation Details 32
    4.3.1 Hardware and Software Environment 32
    4.3.2 Training Details and Hyperparameter Settings 32
  4.4 Analysis 33
    4.4.1 ResNet-152 Double Output 34
    4.4.2 ResNet-18 Only Layer1 35
    4.4.3 Image Without Background 36
    4.4.4 ResNet-18 Layer1×Layer4 36
  4.5 Results and Comparison 38
    4.5.1 ResNet-18×Mask 38
    4.5.2 Confusion Matrix 39
    4.5.3 Model Attention 41
    4.5.4 Regression Results 43
Chapter 5 Conclusions and Future Work 45
  5.1 Conclusions 45
  5.2 Future Work 45
References 47


Full text release date: 2025/01/22 (off-campus access)
Full text release date: 2025/01/22 (National Central Library: Taiwan Dissertations and Theses System)