
Student: Lun-Da Yuan (袁倫大)
Title: Defect Detection with Contrastive Learning Based Knowledge Distillation (基於對比式學習之知識蒸餾於瑕疵偵測之應用)
Advisor: Jing-Ming Guo (郭景明)
Committee: Shih-Hsuan Yang (楊士萱), Nai-Jian Wang (王乃堅), Chih-Peng Fan (范志鵬), Ching-Chun Huang (黃敬群)
Degree: Master
Department: Department of Electrical Engineering, College of Electrical Engineering and Computer Science
Year of publication: 2023
Academic year of graduation: 111 (ROC calendar)
Language: Chinese
Pages: 75
Keywords: defect detection, knowledge distillation, contrastive learning, self-supervised learning, unsupervised learning

Abstract: This thesis presents a method that employs contrastive-learning pre-trained models as auxiliary tools for knowledge distillation to enhance defect detection performance. The study uses the MVTec dataset, which contains 15 categories of industrial production images (10 object categories and 5 texture categories). Of the 5,354 total samples, the training set comprises 3,629 images and the test set 1,725 images; the training set contains only defect-free normal samples, whereas the test set consists of defective samples together with their corresponding annotated ground-truth images. On the model side, the thesis begins with the SimCLR contrastive learning model and investigates two training strategies: training from scratch and fine-tuning. Experiments show that fine-tuning achieves a lower training loss and a relatively higher Top-1 accuracy than training from scratch. Next, the reverse knowledge distillation defect detection model (RD model) is modified by replacing the teacher encoder with a backbone pre-trained with SimCLR. Surprisingly, the fine-tuned backbone attains an average AUC of 83.07%, while the from-scratch backbone achieves an average AUC of 90.51%. Finally, the thesis uses the pre-trained backbone as an auxiliary model while training the RD model; the auxiliary model is removed in the testing phase, and only the trained projection module is retained. Experiments show that the fine-tuned backbone yields an average AUC of 91.86% at test time, while the from-scratch pre-trained backbone reaches an even higher average AUC of 92.56%. The thesis adopts self-supervised and unsupervised learning, training with only defect-free, unannotated normal samples, which makes the proposed method more practical and reliable for current industrial production line applications.
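To make the contrastive pre-training step concrete, the following is a minimal sketch of the NT-Xent loss that SimCLR optimizes over two augmented views of each image. It is an illustrative assumption rather than the thesis implementation; the batch size, embedding dimension, and temperature value are arbitrary.

```python
# Hedged sketch of the NT-Xent (normalized temperature-scaled cross entropy)
# loss used by SimCLR; shapes and the temperature tau are illustrative.
import torch
import torch.nn.functional as F

def nt_xent(z1, z2, tau=0.5):
    """z1, z2: (N, D) projections of two augmented views of the same N images."""
    z = F.normalize(torch.cat([z1, z2], dim=0), dim=1)   # (2N, D), unit-norm rows
    sim = z @ z.t() / tau                                # pairwise cosine / tau
    n = z1.shape[0]
    mask = torch.eye(2 * n, dtype=torch.bool)
    sim.masked_fill_(mask, float("-inf"))                # exclude self-pairs
    # The positive for sample i is its other augmented view: i <-> i + n.
    targets = torch.cat([torch.arange(n, 2 * n), torch.arange(0, n)])
    return F.cross_entropy(sim, targets)

z1, z2 = torch.randn(8, 128), torch.randn(8, 128)        # toy projections
print(nt_xent(z1, z2))                                   # scalar loss
```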
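Likewise, here is a hedged sketch of how reverse-distillation methods such as the RD model typically turn teacher-student feature disagreement into an anomaly map via cosine similarity, as the abstract describes. The function name anomaly_map and the toy feature shapes are assumptions for illustration, not the thesis code.

```python
# Minimal sketch (an assumption, not the thesis implementation): reverse-
# distillation anomaly scoring. A frozen teacher encoder and a trained
# student decoder each emit multi-scale features; pixels where their
# cosine similarity is low are scored as likely defects.
import torch
import torch.nn.functional as F

def anomaly_map(teacher_feats, student_feats, out_size=(256, 256)):
    """Average per-scale (1 - cosine similarity) maps, upsampled to out_size."""
    b = teacher_feats[0].shape[0]
    amap = torch.zeros(b, 1, *out_size)
    for t, s in zip(teacher_feats, student_feats):
        # Cosine similarity along the channel axis: (B, C, H, W) -> (B, H, W).
        sim = F.cosine_similarity(t, s, dim=1).unsqueeze(1)
        # Disagreement (1 - similarity) is treated as the defect signal.
        amap += F.interpolate(1.0 - sim, size=out_size,
                              mode="bilinear", align_corners=False)
    return amap / len(teacher_feats)

# Toy multi-scale features standing in for the teacher-encoder and
# student-decoder outputs of a ResNet-style backbone (three stages).
torch.manual_seed(0)
t_feats = [torch.randn(2, 64, 64, 64),
           torch.randn(2, 128, 32, 32),
           torch.randn(2, 256, 16, 16)]
s_feats = [f + 0.1 * torch.randn_like(f) for f in t_feats]  # near-match

amap = anomaly_map(t_feats, s_feats)
image_scores = amap.flatten(1).max(dim=1).values  # image-level anomaly score
print(amap.shape, image_scores)
```

Image-level AUC values like those quoted in the abstract would then come from ranking these per-image scores against the normal/defective labels of the test set (e.g., with sklearn.metrics.roc_auc_score).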

Table of Contents

Abstract (Chinese)
Abstract (English)
Acknowledgments
Table of Contents
List of Figures
List of Tables
Chapter 1: Introduction
  1.1 Research Background and Motivation
  1.2 Introduction to Defect Detection
  1.3 Introduction to Defect Images
  1.4 Thesis Organization
Chapter 2: Literature Review
  2.1 Contrastive Learning
    2.1.1 Siamese Network [1]
    2.1.2 Data Augmentation Methods and Contrastive Learning Models [2]
  2.2 Knowledge Distillation Models for Defect Detection
    2.2.1 Knowledge Distillation and Defect Detection [3]
    2.2.2 Reverse Knowledge Distillation [4]
    2.2.3 Reverse Knowledge Distillation with an Auxiliary Model [5]
Chapter 3: Contrastive Learning and Knowledge Distillation for Defect Detection
  3.1 Architecture Flowchart
  3.2 Training Phase Model
    3.2.1 SimCLR Model
    3.2.2 Knowledge Distillation Model for Defect Detection (The Proposed Method)
  3.3 Inference Phase Model
  3.4 Dataset Composition
  3.5 Positive and Negative Samples and Data Preprocessing
  3.6 Cosine Similarity Computation
  3.7 Detection Mechanism in the Testing Phase
Chapter 4: Experimental Results
  4.1 Test Environment
  4.2 Performance Evaluation Metrics
    4.2.1 Top-1 Accuracy
    4.2.2 ROC Curve (Receiver Operating Characteristic Curve)
    4.2.3 AUC (Area Under the ROC Curve)
  4.3 Analysis of Defect Detection Results
    4.3.1 Experiment 1: Performance of the SimCLR-Trained Model
    4.3.2 Experiment 2: Performance of Contrastive Learning in Knowledge Distillation I
    4.3.3 Experiment 3: Performance of Contrastive Learning in Knowledge Distillation II
  4.4 Analysis of Poor Defect Detection Performance
Chapter 5: Conclusion and Future Work
References

[1] Z. Wu, Y. Xiong, S. X. Yu, and D. Lin, "Unsupervised feature learning via non-parametric instance discrimination," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 3733-3742, 2018.
[2] M. Ye, X. Zhang, P. C. Yuen, and S.-F. Chang, "Unsupervised embedding learning via invariant and spreading instance feature," in arXiv preprint arXiv:1904.03436, 2019.
[3] A. van den Oord, Y. Li, and O. Vinyals, "Representation learning with contrastive predictive coding," in arXiv preprint arXiv:1807.03748, 2018.
[4] Y. Tian, D. Krishnan, and P. Isola, "Contrastive multiview coding," in arXiv preprint arXiv:1906.05849, 2019.
[5] K. He, H. Fan, Y. Wu, S. Xie, and R. Girshick, "Momentum contrast for unsupervised visual representation learning," in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 9726-9735, 2020.
[6] X. Chen, H. Fan, R. Girshick, and K. He, "Improved baselines with momentum contrastive learning," in arXiv preprint arXiv:2003.04297, 2020.
[7] M. Caron, I. Misra, J. Mairal, P. Goyal, P. Bojanowski, and A. Joulin, "Unsupervised learning of visual features by contrasting cluster assignments," in arXiv preprint arXiv:2006.09882, 2020.
[8] J.-B. Grill et al., "Bootstrap your own latent: A new approach to self-supervised learning," in Proc. 34th Int. Conf. Neural Inf. Process. Syst., Art. no. 1786, 2020.
[9] M. Caron, H. Touvron, I. Misra, H. Jégou, J. Mairal, P. Bojanowski, and A. Joulin, "Emerging properties in self-supervised vision transformers," in Proceedings of the IEEE/CVF International Conference on Computer Vision, pp. 9650-9660, 2021.
[10] J. Bromley, J. W. Bentz, L. Bottou, I. Guyon, Y. LeCun, C. Moore, E. Säckinger, and R. Shah, "Signature verification using a "Siamese" time delay neural network," IJPRAI, vol. 7, no. 4, pp. 669-688, 1993.
[11] T. Chen, S. Kornblith, M. Norouzi, and G. Hinton, "A simple framework for contrastive learning of visual representations," in Proceedings of the 37th International Conference on Machine Learning, PMLR, vol. 119, pp. 1597-1607, 2020.
[12] C. Bucilua, R. Caruana, and A. Niculescu-Mizil, "Model compression," in Proceedings of the 12th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD '06), pp. 535-541, 2006.
[13] G. Hinton, O. Vinyals, and J. Dean, "Distilling the knowledge in a neural network," in arXiv preprint arXiv:1503.02531, 2015.
[14] Y. Zhang, T. Xiang, T. M. Hospedales, and H. Lu, "Deep mutual learning," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4320-4328, 2018.
[15] J. H. Cho and B. Hariharan, "On the efficacy of knowledge distillation," in Proceedings of the IEEE/CVF International Conference on Computer Vision, pp. 4794-4802, 2019.
[16] Q. Xie, M.-T. Luong, E. Hovy, and Q. V. Le, "Self-training with noisy student improves ImageNet classification," in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020.
[17] C. Yang, L. Xie, S. Qiao, and A. L. Yuille, "Training deep neural networks in generations: A more tolerant teacher educates better students," in Proc. AAAI Conf. Artif. Intell., pp. 5628-5635, 2019.
[18] M. Phuong and C. H. Lampert, "Distillation-based training for multi-exit architectures," in Proc. IEEE Int. Conf. Comput. Vis., pp. 1355-1364, 2019.
[19] J. Yim, D. Joo, J.-H. Bae, and J. Kim, "A gift from knowledge distillation: Fast optimization, network minimization and transfer learning," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2017.
[20] F. Tung and G. Mori, "Similarity-preserving knowledge distillation," in Proceedings of the IEEE/CVF International Conference on Computer Vision, pp. 1365-1374, 2019.
[21] A. Romero, N. Ballas, S. E. Kahou, A. Chassang, C. Gatta, and Y. Bengio, "FitNets: Hints for thin deep nets," in International Conference on Learning Representations, 2015.
[22] S. Zagoruyko and N. Komodakis, "Paying more attention to attention: Improving the performance of convolutional neural networks via attention transfer," in International Conference on Learning Representations, 2017.
[23] P. Bergmann, M. Fauser, D. Sattlegger, and C. Steger, "Uninformed students: Student-teacher anomaly detection with discriminative latent embeddings," in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 4183-4192, 2020.
[24] H. Deng and X. Li, "Anomaly detection via reverse distillation from one-class embedding," in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 9737-9746, 2022.
[25] T. D. Tien, A. T. Nguyen, N. H. Tran, T. D. Huy, S. T. M. Duong, C. D. T. Nguyen, and S. Q. H. Truong, "Revisiting reverse distillation for anomaly detection," in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 24511-24520, 2023.
[26] P. Bergmann, K. Batzner, M. Fauser, D. Sattlegger, and C. Steger, "The MVTec anomaly detection dataset: A comprehensive real-world dataset for unsupervised anomaly detection," International Journal of Computer Vision, vol. 129, no. 4, pp. 1038-1059, 2021.
[27] P. Bergmann, M. Fauser, D. Sattlegger, and C. Steger, "MVTec AD — A comprehensive real-world dataset for unsupervised anomaly detection," in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 9584-9592, 2019.
[28] S. Lee, S. Lee, and B.-C. Song, "CFA: Coupled-hypersphere-based feature adaptation for target-oriented anomaly localization," IEEE Access, vol. 10, pp. 78446-78454, 2022.
[29] K. Roth, L. Pemula, J. Zepeda, B. Schölkopf, T. Brox, and P. Gehler, "Towards total recall in industrial anomaly detection," in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 14298-14308, 2022.
[30] X. Chen and K. He, "Exploring simple Siamese representation learning," in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 15750-15758, 2021.
[31] G. Koch, R. Zemel, and R. Salakhutdinov, "Siamese neural networks for one-shot image recognition," in ICML Deep Learning Workshop, vol. 2, 2015.

Full text release date: 2025/08/22 (campus network)
Full text release date: 2025/08/22 (off-campus network)
Full text release date: 2025/08/22 (National Central Library: Taiwan NDLTD system)