
Graduate Student: 林政翰 (Jheng-Han Lin)
Thesis Title: 基於特徵相似度比對及半監督式學習之錯誤標註效能提昇技術
Performance Boosting Mislabels Correction with Semi-Supervised Learning and Deep Feature Similarity Measurements
Advisor: 郭景明 (Jing-Ming Guo)
Committee Members: 郭天穎 (Tien-Ying Kuo), 鍾國亮 (Kuo-Liang Chung), 黃志良 (Chih-Lyang Hwang), 莊仁輝 (Jen-Hui Chuang)
Degree: Master
Department: Department of Electrical Engineering, College of Electrical Engineering and Computer Science
Year of Publication: 2021
Graduation Academic Year: 109 (2020-2021)
Language: Chinese
Number of Pages: 77
Keywords (Chinese): Semi-Supervised Learning, Feature Matching, Clothing1M, Classification, MixMatch
Keywords (English): Semi-Supervised Learning, Feature Matching, Clothing1M, MixMatch
In most previous deep learning research, model training relies on large and cleanly labeled datasets to achieve good performance; once a dataset contains some mislabeled data, the model's accuracy can be seriously degraded. Taking image classification as an example, typical deep learning methods use supervised learning to let the model learn the features of each class from a large number of training samples, and correctly labeling every sample incurs an enormous labor cost. To reduce the labeling cost, semi-supervised learning algorithms are strongly motivated in practical applications: they train the model with a small amount of labeled data and a large amount of unlabeled data, and further improve performance through feature matching among samples.
The experiments in this thesis use the public Clothing1M dataset, which contains a small training set, a validation set, and a test set with clean labels, plus roughly one million training images whose labels are noisy to an unknown degree. This thesis therefore studies how to train a better-performing model in the presence of mislabeled data. From a semi-supervised learning perspective, the small clean training set is treated as labeled data, and the large noisy set is treated as unlabeled data. A preliminary classification model is first trained on the labeled set; this model then extracts deep features from the unlabeled data, and a feature-matching procedure selects, for each class, the several most representative samples as that class's prototypes. The model's classification prediction and the prototype-based prediction can then be compared to form a dual screening mechanism: if the two predictions agree, the sample is treated as labeled data; otherwise it is treated as unlabeled data. This dual screening picks out clean samples from the noisily labeled set and minimizes the impact of mislabeled data on the model. To make maximal use of the dataset, the screened clean data and the remaining possibly mislabeled data are combined for training with the MixMatch algorithm, further improving classification performance. Experimental results show that the proposed dataset denoising method, together with full use of the dataset, improves classification accuracy by about 3% and outperforms the current state-of-the-art method by about 1%. The proposed method (1) reduces the labeling cost, (2) mitigates the impact of noisy labels on training through the dual screening mechanism, and (3) improves accuracy by training with a semi-supervised learning algorithm.
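The prototype-selection step described above can be illustrated with a short sketch. The following minimal PyTorch example assumes deep features have already been extracted by the initially trained classifier; the function name select_prototypes, the use of cosine similarity to the class mean, and the number of prototypes k are illustrative assumptions rather than the thesis's exact procedure.

    # Minimal sketch of class-wise prototype selection from deep features.
    # Assumptions (not from the thesis): cosine similarity to the class mean
    # decides representativeness, and k prototypes are kept per class.
    import torch
    import torch.nn.functional as F

    @torch.no_grad()
    def select_prototypes(features: torch.Tensor, labels: torch.Tensor,
                          num_classes: int, k: int = 8):
        """Return, for each class, the indices of the k samples whose
        normalized features are closest to the class mean direction."""
        feats = F.normalize(features, dim=1)            # (N, D) unit vectors
        prototypes = {}
        for c in range(num_classes):
            idx = (labels == c).nonzero(as_tuple=True)[0]
            if idx.numel() == 0:                        # skip empty classes
                continue
            class_feats = feats[idx]                    # (Nc, D)
            center = F.normalize(class_feats.mean(dim=0), dim=0)
            sims = class_feats @ center                 # cosine similarity to the center
            top = sims.topk(min(k, idx.numel())).indices
            prototypes[c] = idx[top]                    # most representative samples
        return prototypes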


In conventional deep learning research, training datasets are expected to be large-scale and correctly labeled to achieve high performance; however, the performance of deep learning models may drop significantly if noisy labels exist in the training data. Taking classification tasks as an example, deep learning models learn class-relevant features from big data under supervised learning, and correctly labeling every sample as ground truth incurs a high labor cost. To reduce the labeling cost, semi-supervised learning algorithms were proposed to exploit unlabeled data and boost model performance through feature matching among samples.
In this study, the public dataset Clothing1M was used in the experiments. The dataset comprises a small number of clean samples for training, validation, and testing, while the remaining roughly one million samples carry noisy labels whose noise level is unknown. The purpose of this study is therefore to tackle the problem of noisy labels and boost model performance. From the perspective of semi-supervised learning, the clean subset is treated as the labeled dataset, and the remaining noisy data are regarded as unlabeled. An initial model is first trained on the labeled dataset and then used to extract deep features from the unlabeled data; the prototypes for each category are obtained via feature matching and clustering. A dual screening scheme is then proposed that takes both the model's predictions and the predictions from the prototype method into account, reducing the impact of noisy data. In addition, the clean dataset after screening and the remaining data with noisy labels are trained jointly with MixMatch to further enhance the robustness of the model. Experimental results show that the proposed method boosts classification accuracy by about 3% and outperforms the state-of-the-art method by about 1%. It achieves (1) reduced labeling cost, (2) mitigation of the impact of noisy data via the dual screening scheme, and (3) performance boosting through semi-supervised learning.
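The dual screening scheme can likewise be sketched in a few lines. The PyTorch example below splits the noisy pool into a pseudo-labeled subset, where the classifier prediction and the nearest-prototype prediction agree, and an unlabeled subset handed to MixMatch; the function dual_screen and its interface are assumptions for illustration, not the thesis's actual code.

    # Sketch of the dual screening idea: a noisy sample is kept as
    # (pseudo-)labeled only when the classifier branch and the
    # nearest-prototype branch agree; otherwise it is treated as
    # unlabeled data for MixMatch. Names and shapes are illustrative.
    import torch
    import torch.nn.functional as F

    @torch.no_grad()
    def dual_screen(features, logits, prototype_feats, prototype_labels):
        # features: (N, D) deep features of the noisy samples
        # logits:   (N, C) classifier outputs for the same samples
        # prototype_feats / prototype_labels: stacked prototype features and their classes
        model_pred = logits.argmax(dim=1)                      # classifier branch
        sims = F.normalize(features, dim=1) @ F.normalize(prototype_feats, dim=1).T
        proto_pred = prototype_labels[sims.argmax(dim=1)]      # prototype branch
        agree = model_pred == proto_pred
        labeled_idx = agree.nonzero(as_tuple=True)[0]          # treated as clean/labeled
        unlabeled_idx = (~agree).nonzero(as_tuple=True)[0]     # passed to MixMatch as unlabeled
        return labeled_idx, model_pred[labeled_idx], unlabeled_idx

Under this scheme, samples that pass both checks keep the classifier's prediction as their label, while the rest contribute only as unlabeled data, which is how the screening limits the influence of mislabeled samples on the subsequent MixMatch training.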

    Abstract (Chinese) I
    Abstract (English) II
    Acknowledgments III
    Table of Contents IV
    List of Figures VI
    List of Tables VIII
    Chapter 1 Introduction 1
    1.1 Research Background and Motivation 1
    1.2 Thesis Organization 2
    Chapter 2 Literature Review 3
    2.1 Deep Learning Architectures and Feature Extraction 3
    2.1.1 Artificial Neural Networks 5
    2.1.2 Convolutional Neural Networks 9
    2.1.3 Training Methods for Convolutional Neural Networks 13
    2.1.4 Development of Convolutional Neural Networks 16
    2.1.5 Visualization of Convolutional Neural Networks 21
    2.2 Introduction to Self-Training, Self-Supervised, and Semi-Supervised Models 23
    2.2.1 Self-Supervised Model: SimSiam [31] 24
    2.2.2 Self-Training Model: Noisy Student [32] 25
    2.2.3 Semi-Supervised Model: MixMatch [33] 26
    2.3 Classification with Mislabeled Images 30
    2.3.1 Noise-Robust Learning: CleanNet [34] 30
    2.3.2 Noise-Robust Learning: Self-Learning [36] 33
    2.3.3 Noise-Robust Learning: DivideMix [38] 35
    Chapter 3 Performance Improvement by Combining Feature Similarity Matching with Semi-Supervised Learning 38
    3.1 Architecture Flowchart 38
    3.2 Dataset Composition 39
    3.3 Prototype Selection Method 44
    Chapter 4 Experimental Results 46
    4.1 Test Environment 46
    4.2 Testing Stages and Evaluation Metrics 46
    4.3 Experimental Classification Results and Prototype Selection Results 46
    Chapter 5 Conclusion and Future Work 58
    References 59

    [1] Y. LeCun, L. Bottou, Y. Bengio, and P. Haffner, "Gradient-based learning applied to document recognition," Proceedings of the IEEE, vol. 86, no. 11, pp. 2278-2324, 1998.
    [2] M. D. Zeiler and R. Fergus, "Visualizing and understanding convolutional networks," in European conference on computer vision, 2014: Springer, pp. 818-833.
    [3] M.-C. Popescu, V. E. Balas, L. Perescu-Popescu, and N. Mastorakis, "Multilayer perceptron and neural networks," WSEAS Transactions on Circuits and Systems, vol. 8, no. 7, pp. 579-588, 2009.
    [4] R. Pascanu, T. Mikolov, and Y. Bengio, "On the difficulty of training recurrent neural networks," in International conference on machine learning, 2013: PMLR, pp. 1310-1318.
    [5] V. Nair and G. E. Hinton, "Rectified linear units improve restricted Boltzmann machines," in ICML, 2010.
    [6] D.-A. Clevert, T. Unterthiner, and S. Hochreiter, "Fast and accurate deep network learning by exponential linear units (elus)," arXiv preprint arXiv:1511.07289, 2015.
    [7] W. Shang, K. Sohn, D. Almeida, and H. Lee, "Understanding and improving convolutional neural networks via concatenated rectified linear units," in international conference on machine learning, 2016: PMLR, pp. 2217-2225.
    [8] G. Klambauer, T. Unterthiner, A. Mayr, and S. Hochreiter, "Self-normalizing neural networks," arXiv preprint arXiv:1706.02515, 2017.
    [9] B. Xu, N. Wang, T. Chen, and M. Li, "Empirical evaluation of rectified activations in convolutional network," arXiv preprint arXiv:1505.00853, 2015.
    [10] X. Glorot, A. Bordes, and Y. Bengio, "Deep sparse rectifier neural networks," in Proceedings of the fourteenth international conference on artificial intelligence and statistics, 2011: JMLR Workshop and Conference Proceedings, pp. 315-323.
    [11] C. Gulcehre, M. Moczulski, M. Denil, and Y. Bengio, "Noisy activation functions," in International conference on machine learning, 2016: PMLR, pp. 3059-3068.
    [12] M. D. Zeiler and R. Fergus, "Stochastic pooling for regularization of deep convolutional neural networks," arXiv preprint arXiv:1301.3557, 2013.
    [13] C. Gulcehre, K. Cho, R. Pascanu, and Y. Bengio, "Learned-norm pooling for deep feedforward and recurrent neural networks," in Joint European Conference on Machine Learning and Knowledge Discovery in Databases, 2014: Springer, pp. 530-546.
    [14] S. Ruder, "An overview of gradient descent optimization algorithms," arXiv preprint arXiv:1609.04747, 2016.
    [15] I. Sutskever, J. Martens, G. Dahl, and G. Hinton, "On the importance of initialization and momentum in deep learning," in International conference on machine learning, 2013: PMLR, pp. 1139-1147.
    [16] A. Botev, G. Lever, and D. Barber, "Nesterov's accelerated gradient and momentum as approximations to regularised update descent," in 2017 International Joint Conference on Neural Networks (IJCNN), 2017: IEEE, pp. 1899-1903.
    [17] Ö. Çiçek, A. Abdulkadir, S. S. Lienkamp, T. Brox, and O. Ronneberger, "3D U-Net: learning dense volumetric segmentation from sparse annotation," in International conference on medical image computing and computer-assisted intervention, 2016: Springer, pp. 424-432.
    [18] D. P. Kingma and J. Ba, "Adam: A method for stochastic optimization," arXiv preprint arXiv:1412.6980, 2014.
    [19] C. Shorten and T. M. Khoshgoftaar, "A survey on image data augmentation for deep learning," Journal of Big Data, vol. 6, no. 1, pp. 1-48, 2019.
    [20] A. Krizhevsky, I. Sutskever, and G. E. Hinton, "ImageNet classification with deep convolutional neural networks," Advances in neural information processing systems, vol. 25, pp. 1097-1105, 2012.
    [21] J. Deng, W. Dong, R. Socher, L.-J. Li, K. Li, and L. Fei-Fei, "ImageNet: A large-scale hierarchical image database," in 2009 IEEE conference on computer vision and pattern recognition, 2009: IEEE, pp. 248-255.
    [22] G. H. Dunteman, Principal components analysis (no. 69). Sage, 1989.
    [23] N. Srivastava, G. Hinton, A. Krizhevsky, I. Sutskever, and R. Salakhutdinov, "Dropout: a simple way to prevent neural networks from overfitting," The journal of machine learning research, vol. 15, no. 1, pp. 1929-1958, 2014.
    [24] K. Simonyan and A. Zisserman, "Very deep convolutional networks for large-scale image recognition," arXiv preprint arXiv:1409.1556, 2014.
    [25] C. Szegedy et al., "Going deeper with convolutions," in Proceedings of the IEEE conference on computer vision and pattern recognition, 2015, pp. 1-9.
    [26] K. He, X. Zhang, S. Ren, and J. Sun, "Deep residual learning for image recognition," in Proceedings of the IEEE conference on computer vision and pattern recognition, 2016, pp. 770-778.
    [27] M. Abadi et al., "TensorFlow: A system for large-scale machine learning," in 12th USENIX Symposium on Operating Systems Design and Implementation (OSDI 16), 2016, pp. 265-283.
    [28] Y. Jia et al., "Caffe: Convolutional architecture for fast feature embedding," in Proceedings of the 22nd ACM international conference on Multimedia, 2014, pp. 675-678.
    [29] A. Paszke et al., "PyTorch: An imperative style, high-performance deep learning library," arXiv preprint arXiv:1912.01703, 2019.
    [30] T.-Y. Lin et al., "Microsoft COCO: Common objects in context," in European conference on computer vision, 2014: Springer, pp. 740-755.
    [31] X. Chen and K. He, "Exploring simple siamese representation learning," in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2021, pp. 15750-15758.
    [32] Q. Xie, M.-T. Luong, E. Hovy, and Q. V. Le, "Self-training with noisy student improves imagenet classification," in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020, pp. 10687-10698.
    [33] D. Berthelot, N. Carlini, I. Goodfellow, N. Papernot, A. Oliver, and C. Raffel, "MixMatch: A holistic approach to semi-supervised learning," arXiv preprint arXiv:1905.02249, 2019.
    [34] K.-H. Lee, X. He, L. Zhang, and L. Yang, "CleanNet: Transfer learning for scalable image classifier training with label noise," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2018, pp. 5447-5456.
    [35] G. Guo, H. Wang, D. Bell, Y. Bi, and K. Greer, "KNN model-based approach in classification," in OTM Confederated International Conferences "On the Move to Meaningful Internet Systems", 2003: Springer, pp. 986-996.
    [36] J. Han, P. Luo, and X. Wang, "Deep self-learning from noisy labels," in Proceedings of the IEEE/CVF International Conference on Computer Vision, 2019, pp. 5138-5147.
    [37] T. Xiao, T. Xia, Y. Yang, C. Huang, and X. Wang, "Learning from massive noisy labeled data for image classification," in Proceedings of the IEEE conference on computer vision and pattern recognition, 2015, pp. 2691-2699.
    [38] J. Li, R. Socher, and S. C. Hoi, "DivideMix: Learning with noisy labels as semi-supervised learning," arXiv preprint arXiv:2002.07394, 2020.

    Full text release date: 2024/09/11 (campus network)
    Full text not authorized for public access (off-campus network)
    Full text not authorized for public access (National Central Library: Taiwan NDLTD system)