
Graduate Student: Kuan-Yu Chan (陳冠宇)
Thesis Title: Robust Semi-Supervised Learning on Noisy Labels with Multi-Consistency and Data Augmentation (基於雙模型一致性與資料增強在半監督學習之錯誤標註效能提昇技術)
Advisor: Jing-Ming Guo (郭景明)
Committee Members: Chung-Nan Lee (李宗南), Pei-Jun Lee (李佩君), Chih-Hsien Hsia (夏至賢), Wei-Wen Hsu (徐位文)
Degree: Master
Department: Department of Electrical Engineering, College of Electrical Engineering and Computer Science
Year of Publication: 2022
Graduating Academic Year: 110
Language: Chinese
Number of Pages: 97
Chinese Keywords: 半監督學習 (Semi-Supervised Learning), 雙模型一致性 (Dual-Model Consistency), 資料增強 (Data Augmentation), 噪音學習 (Noisy-Label Learning), Animal-10N, Clothing1M
Foreign Keywords: Semi-Supervised Learning, Multi-Consistency, Data Augmentation, Noisy Label Learning, Animal-10N, Clothing1M

In past deep learning research, most model training has relied on large amounts of correctly annotated data to achieve good performance; however, once a dataset contains some mislabeled samples, the model's accuracy can be seriously degraded. In image classification, the common deep learning approach uses a large amount of labeled data so that the model learns the features of each class from the samples under supervised learning, and producing correct labels usually requires a huge amount of manual re-annotation. To reduce this labeling cost, semi-supervised learning algorithms are of great practical interest: they train the model with a small amount of labeled data together with a large amount of unlabeled data, and further improve performance through feature matching among samples.
The experiments in this thesis use two public noisy-label datasets, Clothing1M and Animal-10N. Both were collected by web crawlers, so the extent of their label noise cannot be estimated. This thesis therefore studies how to train a model to higher accuracy when part of the data is mislabeled. The method is designed from a semi-supervised learning perspective. The outputs of a warm-up model are divided according to their losses: samples with small losses are treated as labeled data, while samples with large losses are regarded as mislabeled and therefore treated as unlabeled data. The unlabeled data are then further screened by dual-model consistency: samples on which the two models are highly consistent are selected and given pseudo-labels for training, while the rest are regarded as noisy data and excluded from training. Together these two steps form a dual screening mechanism that picks noisy samples out of the unlabeled set and reduces the influence of noise and mislabeling. In addition, to improve model robustness, the selected clean data and the pseudo-labeled unlabeled data chosen by dual-model consistency are given strong data augmentation and combined for training with the Mixup algorithm, further improving classification performance.
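As a concrete illustration of the loss-based split described above, the following is a minimal sketch in PyTorch/scikit-learn, assuming a DivideMix-style [49] two-component Gaussian mixture over normalized per-sample warm-up losses; the function name `co_divide`, the data loader interface, and the 0.5 clean-probability threshold are illustrative assumptions rather than the thesis's exact settings.

```python
import numpy as np
import torch
import torch.nn.functional as F
from sklearn.mixture import GaussianMixture

def co_divide(model, loader, device, clean_threshold=0.5):
    """Split a noisily labeled dataset by per-sample loss of a warmed-up model.

    Samples falling in the low-loss mixture component (clean probability above
    clean_threshold) are kept as labeled data; the rest are treated as unlabeled.
    The GMM criterion is assumed here, following DivideMix-style co-divide.
    """
    model.eval()
    losses = []
    with torch.no_grad():
        for images, labels in loader:
            logits = model(images.to(device))
            loss = F.cross_entropy(logits, labels.to(device), reduction="none")
            losses.append(loss.cpu())
    losses = torch.cat(losses).numpy().reshape(-1, 1)
    # Normalize losses to [0, 1] before fitting the two-component mixture.
    losses = (losses - losses.min()) / (losses.max() - losses.min() + 1e-8)

    gmm = GaussianMixture(n_components=2, max_iter=20, reg_covar=5e-4)
    gmm.fit(losses)
    clean_component = gmm.means_.argmin()          # low-loss component = likely clean
    prob_clean = gmm.predict_proba(losses)[:, clean_component]

    labeled_idx = np.where(prob_clean > clean_threshold)[0]
    unlabeled_idx = np.where(prob_clean <= clean_threshold)[0]
    return labeled_idx, unlabeled_idx, prob_clean
```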
Experimental results show that the proposed dataset denoising and data-strengthening strategy improves accuracy by about 0.1% over the state-of-the-art methods on Clothing1M and by about 2% over the state-of-the-art methods on Animal-10N. The proposed method (1) introduces strong data augmentation in the training stage to increase model robustness, (2) uses the dual-model consistency mechanism to reduce the influence of noisy data on training, and (3) trains with a semi-supervised learning algorithm to improve accuracy.


In deep learning, most model training has relied on a large amount of correctly annotated data to achieve promising performance. However, if part of the annotations is incorrect, the accuracy of the model may be seriously affected. In image classification tasks, deep learning models learn class-relevant features from large datasets under supervised learning, and correctly classifying every sample to generate ground-truth labels incurs a high labor cost. To reduce the labeling cost, semi-supervised learning algorithms were proposed to exploit unlabeled data and boost model performance by feature matching among samples.
In this study, two public noisy-label datasets are used: Clothing1M and Animal-10N. Both are generated by web crawlers, and thus their label quality cannot be estimated. The purpose of this study is therefore to tackle noisy labels and boost model performance. The method is designed based on a semi-supervised learning approach. The outputs of the warm-up model are separated according to their losses: samples with small losses are considered labeled data, while those with large losses are treated as unlabeled data and are further processed by multi-consistency. Consistent samples are selected from the unlabeled set and given pseudo-labels for training, while the others are regarded as noisy data and excluded from training. Together, these two steps form a dual screening scheme that filters noisy samples out of the unlabeled data, reducing the impact of noise and mislabeling. In addition, to improve the robustness of the model, the clean data and the pseudo-labeled unlabeled data are combined with strong data augmentation and trained with the Mixup algorithm to further enhance classification performance.
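To make the dual screening concrete, here is a rough sketch of how the multi-consistency (co-guessing) filter on the unlabeled split might look, assuming agreement between the two networks' predicted classes plus a confidence threshold as the consistency criterion; the threshold `tau`, the temperature sharpening, and the function interface are assumptions, not the thesis's exact formulation.

```python
import torch
import torch.nn.functional as F

@torch.no_grad()
def co_guess(model_a, model_b, unlabeled_images, tau=0.9, temperature=0.5):
    """Keep only unlabeled samples on which the two models agree.

    Returns the retained images, their sharpened soft pseudo-labels, and a
    boolean mask; samples where the two networks disagree (or are not confident
    enough) are treated as noise and dropped from training.
    """
    probs_a = F.softmax(model_a(unlabeled_images), dim=1)
    probs_b = F.softmax(model_b(unlabeled_images), dim=1)

    conf_a, pred_a = probs_a.max(dim=1)
    conf_b, pred_b = probs_b.max(dim=1)

    # Consistency criterion (assumed): same predicted class and high confidence.
    consistent = (pred_a == pred_b) & (torch.min(conf_a, conf_b) > tau)

    # Average the two predictions and sharpen them into soft pseudo-labels.
    avg = (probs_a + probs_b) / 2
    sharpened = avg ** (1.0 / temperature)
    pseudo_labels = sharpened / sharpened.sum(dim=1, keepdim=True)

    return unlabeled_images[consistent], pseudo_labels[consistent], consistent
```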
Experimental results show that the proposed method boosts classification performance: the accuracy is about 0.1% higher than the state-of-the-art methods on Clothing1M and about 2% higher on Animal-10N. The contributions are summarized as follows: 1) strengthening the model with strong data augmentation during training, 2) using multi-consistency to reduce the impact of noisy data, and 3) boosting performance with semi-supervised learning.
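For reference, the Mixup training mentioned above can be sketched as the standard interpolation x' = λx_i + (1 − λ)x_j, y' = λy_i + (1 − λ)y_j [36] over the union of the clean batch and the strongly augmented pseudo-labeled batch; the Beta(α, α) parameter and the max(λ, 1 − λ) clamp are assumptions borrowed from DivideMix-style training [49], not settings confirmed by the abstract.

```python
import numpy as np
import torch

def mixup(inputs_x, targets_x, inputs_u, targets_u, alpha=4.0):
    """Mixup [36] over the union of the clean batch and the pseudo-labeled batch.

    Both batches are assumed to already carry strong augmentation and soft or
    one-hot targets. lam is clamped with max(lam, 1 - lam) so each mixed sample
    stays closer to its first component (an assumed DivideMix-style choice).
    """
    all_inputs = torch.cat([inputs_x, inputs_u], dim=0)
    all_targets = torch.cat([targets_x, targets_u], dim=0)

    lam = np.random.beta(alpha, alpha)
    lam = max(lam, 1.0 - lam)

    # Random pairing within the combined batch.
    index = torch.randperm(all_inputs.size(0))
    mixed_inputs = lam * all_inputs + (1.0 - lam) * all_inputs[index]
    mixed_targets = lam * all_targets + (1.0 - lam) * all_targets[index]
    return mixed_inputs, mixed_targets
```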

Abstract (Chinese)
Abstract (English)
Acknowledgments
Table of Contents
List of Figures
List of Tables
Chapter 1  Introduction
  1.1 Research Background and Motivation
  1.2 Thesis Organization
Chapter 2  Literature Review
  2.1 Deep Learning Architectures and Feature Extraction
    2.1.1 Artificial Neural Networks
    2.1.2 Convolutional Neural Networks
    2.1.3 Training Methods for Convolutional Neural Networks
    2.1.4 Development of Convolutional Neural Networks
    2.1.5 Visualization of Convolutional Neural Networks
  2.2 Self-Supervised and Semi-Supervised Models
    2.2.1 Self-supervised model iBOT [32]
    2.2.2 Semi-supervised model MixMatch [35]
    2.2.3 Semi-supervised model FixMatch [42]
  2.3 Data Augmentation
    2.3.1 AutoAugment [47]
    2.3.2 RandAugment [43]
    2.3.3 AugMix [48]
  2.4 Classification with Mislabeled Images
    2.4.1 Noise-robust learning: DivideMix [49]
    2.4.2 Noise-robust learning: NCR [58]
Chapter 3  Robust Semi-Supervised Learning on Noisy Labels with Multi-Consistency and Data Augmentation
  3.1 Architecture Flow Diagram
  3.2 Dataset Composition
    3.2.1 Clothing1M
    3.2.2 Animal-10N
  3.3 Co-Divide
  3.4 Co-Guessing
  3.5 Mixed Training
Chapter 4  Experimental Results
  4.1 Test Environment
  4.2 Testing Stages and Evaluation Metrics
  4.3 Network and Training Parameter Settings
  4.4 Ablation Study
  4.5 Analysis of Experimental Results
    4.5.1 Comparison with Previous Work
    4.5.2 Analysis of Experimental Results
Chapter 5  Conclusion and Future Work
References

[1] Y. LeCun, L. Bottou, Y. Bengio, and P. Haffner, "Gradient-based learning applied to document recognition," Proceedings of the IEEE, vol. 86, no. 11, pp. 2278-2324, 1998.
[2] M. D. Zeiler and R. Fergus, "Visualizing and understanding convolutional networks," in European conference on computer vision, 2014: Springer, pp. 818-833.
[3] M.-C. Popescu, V. E. Balas, L. Perescu-Popescu, and N. Mastorakis, "Multilayer perceptron and neural networks," WSEAS Transactions on Circuits and Systems, vol. 8, no. 7, pp. 579-588, 2009.
[4] R. Pascanu, T. Mikolov, and Y. Bengio, "On the difficulty of training recurrent neural networks," in International conference on machine learning, 2013: PMLR, pp. 1310-1318.
[5] V. Nair and G. E. Hinton, "Rectified linear units improve restricted Boltzmann machines," in ICML, 2010.
[6] D.-A. Clevert, T. Unterthiner, and S. Hochreiter, "Fast and accurate deep network learning by exponential linear units (elus)," arXiv preprint arXiv:1511.07289, 2015.
[7] W. Shang, K. Sohn, D. Almeida, and H. Lee, "Understanding and improving convolutional neural networks via concatenated rectified linear units," in international conference on machine learning, 2016: PMLR, pp. 2217-2225.
[8] G. Klambauer, T. Unterthiner, A. Mayr, and S. Hochreiter, "Self-normalizing neural networks," arXiv preprint arXiv:1706.02515, 2017.
[9] B. Xu, N. Wang, T. Chen, and M. Li, "Empirical evaluation of rectified activations in convolutional network," arXiv preprint arXiv:1505.00853, 2015.
[10] X. Glorot, A. Bordes, and Y. Bengio, "Deep sparse rectifier neural networks," in Proceedings of the fourteenth international conference on artificial intelligence and statistics, 2011: JMLR Workshop and Conference Proceedings, pp. 315-323.
[11] C. Gulcehre, M. Moczulski, M. Denil, and Y. Bengio, "Noisy activation functions," in International conference on machine learning, 2016: PMLR, pp. 3059-3068.
[12] M. D. Zeiler and R. Fergus, "Stochastic pooling for regularization of deep convolutional neural networks," arXiv preprint arXiv:1301.3557, 2013.
[13] C. Gulcehre, K. Cho, R. Pascanu, and Y. Bengio, "Learned-norm pooling for deep feedforward and recurrent neural networks," in Joint European Conference on Machine Learning and Knowledge Discovery in Databases, 2014: Springer, pp. 530-546.
[14] S. Ruder, "An overview of gradient descent optimization algorithms," arXiv preprint arXiv:1609.04747, 2016.
[15] I. Sutskever, J. Martens, G. Dahl, and G. Hinton, "On the importance of initialization and momentum in deep learning," in International conference on machine learning, 2013: PMLR, pp. 1139-1147.
[16] A. Botev, G. Lever, and D. Barber, "Nesterov's accelerated gradient and momentum as approximations to regularised update descent," in 2017 International Joint Conference on Neural Networks (IJCNN), 2017: IEEE, pp. 1899-1903.
[17] Ö. Çiçek, A. Abdulkadir, S. S. Lienkamp, T. Brox, and O. Ronneberger, "3D U-Net: learning dense volumetric segmentation from sparse annotation," in International conference on medical image computing and computer-assisted intervention, 2016: Springer, pp. 424-432.
[18] D. P. Kingma and J. Ba, "Adam: A method for stochastic optimization," arXiv preprint arXiv:1412.6980, 2014.
[19] C. Shorten and T. M. Khoshgoftaar, "A survey on image data augmentation for deep learning," Journal of Big Data, vol. 6, no. 1, pp. 1-48, 2019.
[20] A. Krizhevsky, I. Sutskever, and G. E. Hinton, "Imagenet classification with deep convolutional neural networks," Advances in neural information processing systems, vol. 25, pp. 1097-1105, 2012.
[21] J. Deng, W. Dong, R. Socher, L.-J. Li, K. Li, and L. Fei-Fei, "Imagenet: A large-scale hierarchical image database," in 2009 IEEE conference on computer vision and pattern recognition, 2009: IEEE, pp. 248-255.
[22] G. H. Dunteman, Principal components analysis (no. 69). Sage, 1989.
[23] N. Srivastava, G. Hinton, A. Krizhevsky, I. Sutskever, and R. Salakhutdinov, "Dropout: a simple way to prevent neural networks from overfitting," The journal of machine learning research, vol. 15, no. 1, pp. 1929-1958, 2014.
[24] K. Simonyan and A. Zisserman, "Very deep convolutional networks for large-scale image recognition," arXiv preprint arXiv:1409.1556, 2014.
[25] C. Szegedy et al., "Going deeper with convolutions," in Proceedings of the IEEE conference on computer vision and pattern recognition, 2015, pp. 1-9.
[26] K. He, X. Zhang, S. Ren, and J. Sun, "Deep residual learning for image recognition," in Proceedings of the IEEE conference on computer vision and pattern recognition, 2016, pp. 770-778.
[27] M. Tan and Q. V. Le, "Efficientnet: Rethinking model scaling for convolutional neural networks," arXiv preprint arXiv:1905.11946, 2019.
[28] M. Abadi et al., "Tensorflow: A system for large-scale machine learning," in 12th USENIX Symposium on Operating Systems Design and Implementation (OSDI 16), 2016, pp. 265-283.
[29] Y. Jia et al., "Caffe: Convolutional architecture for fast feature embedding," in Proceedings of the 22nd ACM international conference on Multimedia, 2014, pp. 675-678.
[30] A. Paszke et al., "Pytorch: An imperative style, high-performance deep learning library," arXiv preprint arXiv:1912.01703, 2019.
[31] T.-Y. Lin et al., "Microsoft coco: Common objects in context," in European conference on computer vision, 2014: Springer, pp. 740-755.
[32] J. Zhou et al., "ibot: Image bert pre-training with online tokenizer," arXiv preprint arXiv:2111.07832, 2021.
[33] H. Bao, L. Dong, and F. Wei, "Beit: Bert pre-training of image transformers," arXiv preprint arXiv:2106.08254, 2021.
[34] M. Caron et al., "Emerging properties in self-supervised vision transformers," in Proceedings of the IEEE/CVF International Conference on Computer Vision, 2021, pp. 9650-9660.
[35] D. Berthelot, N. Carlini, I. Goodfellow, N. Papernot, A. Oliver, and C. A. Raffel, "Mixmatch: A holistic approach to semi-supervised learning," Advances in neural information processing systems, vol. 32, 2019.
[36] H. Zhang, M. Cisse, Y. N. Dauphin, and D. Lopez-Paz, "mixup: Beyond empirical risk minimization," arXiv preprint arXiv:1710.09412, 2017.
[37] S. Laine and T. Aila, "Temporal ensembling for semi-supervised learning," arXiv preprint arXiv:1610.02242, 2016.
[38] M. Sajjadi, M. Javanmardi, and T. Tasdizen, "Regularization with stochastic transformations and perturbations for deep semi-supervised learning," Advances in neural information processing systems, vol. 29, 2016.
[39] D.-H. Lee, "Pseudo-label: The simple and efficient semi-supervised learning method for deep neural networks," in Workshop on challenges in representation learning, ICML, 2013, vol. 3, no. 2, p. 896.
[40] T. Miyato, S.-i. Maeda, M. Koyama, and S. Ishii, "Virtual adversarial training: a regularization method for supervised and semi-supervised learning," IEEE transactions on pattern analysis and machine intelligence, vol. 41, no. 8, pp. 1979-1993, 2018.
[41] A. Tarvainen and H. Valpola, "Mean teachers are better role models: Weight-averaged consistency targets improve semi-supervised deep learning results," Advances in neural information processing systems, vol. 30, 2017.
[42] K. Sohn et al., "Fixmatch: Simplifying semi-supervised learning with consistency and confidence," Advances in neural information processing systems, vol. 33, pp. 596-608, 2020.
[43] E. D. Cubuk, B. Zoph, J. Shlens, and Q. V. Le, "Randaugment: Practical automated data augmentation with a reduced search space," in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, 2020, pp. 702-703.
[44] T. DeVries and G. W. Taylor, "Improved regularization of convolutional neural networks with cutout," arXiv preprint arXiv:1708.04552, 2017.
[45] Q. Xie, Z. Dai, E. Hovy, M.-T. Luong, and Q. V. Le, "Unsupervised data augmentation for consistency training," arXiv preprint arXiv:1904.12848, 2019.
[46] D. Berthelot et al., "Remixmatch: Semi-supervised learning with distribution alignment and augmentation anchoring," arXiv preprint arXiv:1911.09785, 2019.
[47] E. D. Cubuk, B. Zoph, D. Mane, V. Vasudevan, and Q. V. Le, "Autoaugment: Learning augmentation strategies from data," in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2019, pp. 113-123.
[48] D. Hendrycks, N. Mu, E. D. Cubuk, B. Zoph, J. Gilmer, and B. Lakshminarayanan, "Augmix: A simple data processing method to improve robustness and uncertainty," arXiv preprint arXiv:1912.02781, 2019.
[49] J. Li, R. Socher, and S. C. Hoi, "Dividemix: Learning with noisy labels as semi-supervised learning," arXiv preprint arXiv:2002.07394, 2020.
[50] S. Reed, H. Lee, D. Anguelov, C. Szegedy, D. Erhan, and A. Rabinovich, "Training deep neural networks on noisy labels with bootstrapping," arXiv preprint arXiv:1412.6596, 2014.
[51] G. Patrini, A. Rozza, A. Krishna Menon, R. Nock, and L. Qu, "Making deep neural networks robust to label noise: A loss correction approach," in Proceedings of the IEEE conference on computer vision and pattern recognition, 2017, pp. 1944-1952.
[52] X. Yu, B. Han, J. Yao, G. Niu, I. Tsang, and M. Sugiyama, "How does disagreement help generalization against label corruption?," in International Conference on Machine Learning, 2019: PMLR, pp. 7164-7173.
[53] K. Yi and J. Wu, "Probabilistic end-to-end noise correction for learning with noisy labels," in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2019, pp. 7017-7025.
[54] J. Li, Y. Wong, Q. Zhao, and M. S. Kankanhalli, "Learning to learn from noisy labeled data," in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2019, pp. 5051-5059.
[55] E. Arazo, D. Ortego, P. Albert, N. O’Connor, and K. McGuinness, "Unsupervised label noise modeling and loss correction," in International conference on machine learning, 2019: PMLR, pp. 312-321.
[56] D. Tanaka, D. Ikami, T. Yamasaki, and K. Aizawa, "Joint optimization framework for learning with noisy labels," in Proceedings of the IEEE conference on computer vision and pattern recognition, 2018, pp. 5552-5560.
[57] W. Zhang, Y. Wang, and Y. Qiao, "Metacleaner: Learning to hallucinate clean representations for noisy-labeled visual recognition," in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2019, pp. 7373-7382.
[58] A. Iscen, J. Valmadre, A. Arnab, and C. Schmid, "Learning with Neighbor Consistency for Noisy Labels," in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022, pp. 4672-4681.
[59] D. Ortego, E. Arazo, P. Albert, N. E. O'Connor, and K. McGuinness, "Multi-objective interpolation training for robustness to label noise," in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2021, pp. 6606-6615.
[60] S. Liu, J. Niles-Weed, N. Razavian, and C. Fernandez-Granda, "Early-learning regularization prevents memorization of noisy labels," Advances in neural information processing systems, vol. 33, pp. 20331-20342, 2020.
[61] K.-H. Lee, X. He, L. Zhang, and L. Yang, "Cleannet: Transfer learning for scalable image classifier training with label noise," in Proceedings of the IEEE conference on computer vision and pattern recognition, 2018, pp. 5447-5456.
[62] F. R. Cordeiro, R. Sachdeva, V. Belagiannis, I. Reid, and G. Carneiro, "Longremix: Robust learning with high confidence samples in a noisy label environment," arXiv preprint arXiv:2103.04173, 2021.
[63] Y. Chen, X. Shen, S. X. Hu, and J. A. Suykens, "Boosting co-teaching with compression regularization for label noise," in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2021, pp. 2688-2692.
[64] C. Feng, G. Tzimiropoulos, and I. Patras, "S3: Supervised Self-supervised Learning under Label Noise," arXiv preprint arXiv:2111.11288, 2021.
[65] K. Nishi, Y. Ding, A. Rich, and T. Hollerer, "Augmentation strategies for learning with noisy labels," in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2021, pp. 8022-8031.
[66] W. Chen, C. Zhu, and Y. Chen, "Sample Prior Guided Robust Model Learning to Suppress Noisy Labels," arXiv preprint arXiv:2112.01197, 2021.
[67] H. Song, M. Kim, and J.-G. Lee, "Selfie: Refurbishing unclean samples for robust deep learning," in International Conference on Machine Learning, 2019: PMLR, pp. 5907-5915.
[68] Y. Zhang, S. Zheng, P. Wu, M. Goswami, and C. Chen, "Learning with feature-dependent label noise: A progressive approach," arXiv preprint arXiv:2103.07756, 2021.

Full-text release date: 2024/08/26 (campus network)
Full-text release date: 2024/08/26 (off-campus network)
Full-text release date: 2024/08/26 (National Central Library: Taiwan Thesis and Dissertation System)