
Student: Yu-Hsuan Huang (黃鈺軒)
Thesis Title: Study on Feature Enhanced Deep Learning Networks for Spark based Steel Classification (強化AI重要特徵於鋼材火花類別分類技術開發)
Advisor: Shun-Feng Su (蘇順豐)
Committee Members: Yo-Ping Huang (黃有評), Tsu-Tian Lee (李祖添), Nai-Jian Wang (王乃堅), Wei-Yen Wang (王偉彥)
Degree: Master
Department: College of Electrical Engineering and Computer Science, Department of Electrical Engineering
Year of Publication: 2023
Academic Year of Graduation: 112
Language: English
Number of Pages: 63
Keywords: Image regression, Spark recognition, Spark classification, Grad-CAM, Image processing, Image recognition

In this thesis, we propose a model combining ResNet-18 with Grad-CAM: the Grad-CAM output of the model's layer 1 is multiplied with the last CNN layer of ResNet layer 4 to enhance the spark features. To further improve efficiency and accuracy, the layer-1 Grad-CAM output is replaced with a background-free spark mask obtained by passing the input image through a pre-processing pipeline consisting of YOLOv4, gamma correction, HSV, and dilation, which enhances the spark features more precisely. In addition, this study uses IoU to evaluate how well the model's attention falls on the sparks. The classification accuracy of the model reaches 98.4% and the IoU score reaches 37%, improvements of 2.9% and 10.6%, respectively, over the original ResNet-18. The high classification accuracy, together with Grad-CAM as an explainable-AI tool for analyzing the basis of the model's decisions, makes the model more trustworthy. This study also proposes using image regression to estimate the carbon content of the steel directly; although classification based on the regressed values reaches an accuracy of only 75.4%, the R-squared score of 0.973 shows that the model is feasible for this type of regression task.
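As a rough illustration only, a background-free spark mask of the kind described above could be built with OpenCV along the following lines, assuming the YOLOv4 detector has already cropped the spark region; the gamma value and the HSV bounds below are placeholder assumptions, not the settings used in this thesis.

```python
import cv2
import numpy as np

def spark_mask(crop_bgr, gamma=0.5):
    """Minimal sketch: binary spark mask from a YOLOv4-cropped BGR image.

    The gamma value and HSV bounds are illustrative guesses only.
    """
    # Gamma correction: brighten faint spark streaks via a lookup table.
    lut = np.array([(i / 255.0) ** gamma * 255 for i in range(256)],
                   dtype=np.uint8)
    corrected = cv2.LUT(crop_bgr, lut)

    # HSV thresholding: keep bright, warm-coloured pixels (the sparks).
    hsv = cv2.cvtColor(corrected, cv2.COLOR_BGR2HSV)
    lower = np.array([0, 40, 180])    # assumed lower (H, S, V) bound
    upper = np.array([40, 255, 255])  # assumed upper (H, S, V) bound
    mask = cv2.inRange(hsv, lower, upper)

    # Dilation: thicken thin streaks so the mask covers them completely.
    kernel = np.ones((3, 3), np.uint8)
    return cv2.dilate(mask, kernel, iterations=2)  # 255 = spark, 0 = background
```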


In this thesis, we propose a model that combines ResNet-18 with Grad-CAM. The model enhances spark features by multiplying the Grad-CAM output of layer 1 with the last CNN layer of ResNet layer 4. To further improve efficiency and accuracy, we replace the layer-1 Grad-CAM output with a background-free spark mask obtained through a series of pre-processing steps consisting of YOLOv4, gamma correction, HSV, and dilation; this replacement enhances the spark features more precisely. The classification accuracy of the model reaches 98.4% and the IoU score reaches 37%, increases of 2.9% and 10.6%, respectively, over the original ResNet-18. The high classification accuracy, coupled with Grad-CAM for interpreting the model's decision-making rationale, enhances the model's credibility. Furthermore, this research introduces image regression to estimate the carbon content of steel directly. While the classification accuracy based on the regressed values is only 75.4%, the R-squared score of 0.973 indicates the feasibility of the model for this type of regression task.
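To make the feature-enhancement step concrete, the following is a minimal PyTorch sketch of multiplying ResNet-18 layer-4 features by a down-sampled spark mask before classification; the class name MaskEnhancedResNet18, the fusion point, and the nearest-neighbour resizing are illustrative assumptions rather than the exact implementation of this thesis.

```python
import torch.nn as nn
import torch.nn.functional as F
from torchvision.models import resnet18

class MaskEnhancedResNet18(nn.Module):
    """Sketch of the ResNet-18 x mask idea: layer-4 features are re-weighted
    by a binary spark mask so that background activations are suppressed."""

    def __init__(self, num_classes):
        super().__init__()
        backbone = resnet18(weights=None)
        self.stem = nn.Sequential(backbone.conv1, backbone.bn1,
                                  backbone.relu, backbone.maxpool)
        self.stages = nn.Sequential(backbone.layer1, backbone.layer2,
                                    backbone.layer3, backbone.layer4)
        self.pool = nn.AdaptiveAvgPool2d(1)
        self.fc = nn.Linear(512, num_classes)  # layer 4 of ResNet-18 outputs 512 channels

    def forward(self, x, mask):
        # x: (N, 3, H, W) spark image; mask: (N, 1, H, W) with values in [0, 1]
        feat = self.stages(self.stem(x))
        # Resize the mask to layer-4 resolution and multiply it into the features.
        m = F.interpolate(mask, size=feat.shape[-2:], mode="nearest")
        feat = feat * m
        return self.fc(self.pool(feat).flatten(1))
```

In the Grad-CAM variant described in the abstract, `mask` would presumably be a normalized Grad-CAM heat map computed from layer 1 instead of the pre-processing mask.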

Chinese Abstract I
Abstract II
Acknowledgements III
Table of Contents IV
List of Figures VIII
List of Tables X
Chapter 1 Introduction 1
  1.1 Background 1
  1.2 Motivations 2
  1.3 Contributions 3
  1.4 Thesis Organization 4
Chapter 2 Related Work 5
Chapter 3 Methodology 7
  3.1 System Overview 7
  3.2 Pre-processing 8
    3.2.1 YOLO v4 8
    3.2.2 Gamma Correction 9
    3.2.3 HSV 11
    3.2.4 Dilation 12
  3.3 Network Architecture 13
    3.3.1 ResNet 13
    3.3.2 VGG-16 14
    3.3.3 ConvLSTM 15
    3.3.4 ResNet-152 Double Output 16
    3.3.5 ResNet-18 Only Layer1 17
    3.3.6 ResNet-18 Layer1×Layer4 18
    3.3.7 ResNet-18×Mask 19
    3.3.8 ResNet-18 Regression 21
  3.4 Grad-CAM 22
  3.5 Loss Function 24
    3.5.1 Cross-Entropy 24
    3.5.2 Mean Absolute Error 25
Chapter 4 Experiments 26
  4.1 Datasets 26
    4.1.1 Classification Dataset 26
    4.1.2 Regression Dataset 28
  4.2 Evaluation Metrics 29
    4.2.1 Confusion Matrix 29
    4.2.2 IoU 30
    4.2.3 R-Squared 31
  4.3 Implementation Details 32
    4.3.1 Hardware and Software Environment 32
    4.3.2 Training Details and Hyperparameter Settings 32
  4.4 Analysis 33
    4.4.1 ResNet-152 Double Output 34
    4.4.2 ResNet-18 Only Layer1 35
    4.4.3 Image Without Background 36
    4.4.4 ResNet-18 Layer1×Layer4 36
  4.5 Results and Comparison 38
    4.5.1 ResNet-18×Mask 38
    4.5.2 Confusion Matrix 39
    4.5.3 Model Attention 41
    4.5.4 Regression Results 43
Chapter 5 Conclusions and Future Work 45
  5.1 Conclusions 45
  5.2 Future Work 45
References 47


Full text release date: 2025/01/22 (off-campus access)
Full text release date: 2025/01/22 (National Central Library: Taiwan Dissertations and Theses System)