
Graduate Student: Hsin-Chieh Chen (陳信杰)
Thesis Title: Active Learning in Non-Iterative Approach (非迭代式主動式學習)
Advisor: Shi-Jinn Horng (洪西進)
Committee Members: Chu-Hsin Lin (林祝興), Chu-Sing Yang (楊竹星), Cheng-Chi Lee (李正吉), Cheng-An Yen (顏成安), Shi-Jinn Horng (洪西進)
Degree: Master
Department: Department of Computer Science and Information Engineering, College of Electrical Engineering and Computer Science
Publication Year: 2023
Graduation Academic Year: 111 (2022–2023)
Language: Chinese
Pages: 37
Keywords: Active Learning, Non-Iterative Approach, Data Selection
Access Count: Views: 146, Downloads: 0

Abstract: With the rapid development of deep learning, the technology has been widely adopted by companies for a wide range of purposes and products. However, not every use case has a publicly available dataset, so a significant amount of time and manpower must be spent labeling data. Active learning [1] emerged in response: it autonomously selects data that is valuable for model learning and filters out data that contributes little to it, effectively reducing the need for extensive labeling and significantly lowering manpower and time costs.
However, the traditional active learning framework requires multiple iterations, repeating the labeling, training, and data selection steps until the desired amount of data has been selected. In this study, we propose a non-iterative active learning framework that eliminates these iterations. The framework uses an LSTM-based [2] confidence correction model so that the confidence distribution produced by a model trained on a small amount of data approximates the one produced by a model trained on a large amount of data. This in turn improves the accuracy of the entropy-based [3] selection function [4] used to select data.
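To make the selection step concrete, below is a minimal sketch of the pipeline the abstract describes, assuming a PyTorch implementation. The corrector architecture (feeding the per-class confidence vector to an LSTM [2] as a sequence), the hidden size, and the names ConfidenceCorrector and select_most_uncertain are illustrative assumptions, not the thesis's published code; only the overall flow, correcting the weak model's confidences once and then selecting the highest-entropy [3] samples in a single pass with a selection function [4], follows the abstract.

```python
# A minimal sketch of the non-iterative selection step described above.
# All names, shapes, and the corrector architecture are illustrative
# assumptions; the thesis's exact implementation is not published here.
import torch
import torch.nn as nn

class ConfidenceCorrector(nn.Module):
    """Hypothetical LSTM-based model mapping the confidence (softmax)
    distribution of a model trained on little data toward the distribution
    a fully trained model would produce."""
    def __init__(self, num_classes: int, hidden: int = 128):
        super().__init__()
        # Treat the per-class confidence vector as a length-num_classes
        # sequence of scalars fed to the LSTM (an assumption).
        self.lstm = nn.LSTM(input_size=1, hidden_size=hidden, batch_first=True)
        self.head = nn.Linear(hidden, num_classes)

    def forward(self, conf: torch.Tensor) -> torch.Tensor:
        # conf: (batch, num_classes) softmax output of the weak model
        out, _ = self.lstm(conf.unsqueeze(-1))   # (batch, num_classes, hidden)
        return torch.softmax(self.head(out[:, -1]), dim=-1)

def entropy(p: torch.Tensor, eps: float = 1e-12) -> torch.Tensor:
    # Shannon entropy per sample; higher entropy = lower model confidence.
    return -(p * (p + eps).log()).sum(dim=-1)

def select_most_uncertain(conf: torch.Tensor, budget: int) -> torch.Tensor:
    """Entropy-based selection function: pick, in one pass, the `budget`
    unlabeled samples whose (corrected) confidence is most uncertain."""
    return torch.topk(entropy(conf), k=budget).indices

# Usage: correct the weak model's confidences once, then select once --
# no repeated label/train/select iterations.
corrector = ConfidenceCorrector(num_classes=10)
weak_conf = torch.softmax(torch.randn(1000, 10), dim=-1)  # stand-in outputs
picked = select_most_uncertain(corrector(weak_conf), budget=500)
```

Because the corrector stands in for the retraining rounds, the cost of selection is a single forward pass over the unlabeled pool rather than repeated training runs, which is where the reported speedups come from.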
We conducted experiments on CIFAR10 [5] and CIFAR100 [6], two benchmark datasets commonly used in the active learning field. The results show that the proposed method recovers the accuracy of training on the complete dataset while using only 50% of the data on CIFAR10 and 63% on CIFAR100. Moreover, when selecting 50% of the data on CIFAR10, our method matches the accuracy of the traditional active learning framework while improving overall execution speed by 7 times; similarly, when selecting 50% of the data on CIFAR100, it improves execution speed by 4.7 times over the traditional framework.

Table of Contents:
  Abstract (Chinese), Abstract (English), Acknowledgments, Contents, List of Figures, List of Tables
  1 Introduction
    1.1 Research Motivation and Objectives
    1.2 Research Methods
    1.3 Research Contributions
    1.4 Organization of the Thesis
  2 Literature Review
    2.1 Data Selection
    2.2 Active Learning
    2.3 Related Work on Optimizing Active Learning
    2.4 Related Work on Classification Model Confidence
  3 Methodology
    3.1 The CIFAR Datasets
    3.2 System Architecture and Preliminaries
    3.3 Constructing the Required Datasets
    3.4 Data Preprocessing
    3.5 Confidence Distribution Change Prediction Model
    3.6 Confidence Adjustment and Data Selection
  4 Experimental Design
    4.1 System Setup
    4.2 Model Performance Evaluation
    4.3 Experimental Models and Parameter Settings
    4.4 Performance Evaluation of the Confidence Distribution Change Prediction Model
  5 Experimental Results and Analysis
    5.1 Results at Each Selection Ratio on the CIFAR Datasets
    5.2 Comparison with Related Work
      5.2.1 Accuracy Comparison with Related Work
      5.2.2 Time Cost Comparison with Related Work
    5.3 Performance Evaluation of the Confidence Distribution Change Prediction Model
      5.3.1 JS Divergence Before and After Confidence Modification
      5.3.2 Accuracy Change Before and After Confidence Modification
  6 Conclusions and Future Work
    6.1 Conclusions
    6.2 Future Work
  References
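Section 5.3.1 evaluates the corrector by comparing confidence distributions before and after modification with the Jensen-Shannon divergence [20]. As a reference for that metric, here is a minimal sketch in Python; the base-2 logarithm, the eps smoothing, and the name js_divergence are assumptions rather than details taken from the thesis.

```python
import numpy as np

def js_divergence(p: np.ndarray, q: np.ndarray, eps: float = 1e-12) -> float:
    """JSD(p, q) = 0.5*KL(p || m) + 0.5*KL(q || m) with m = (p + q) / 2.

    p and q are confidence (probability) vectors over the same classes;
    the result is symmetric and, with log base 2, bounded in [0, 1].
    """
    p = p / p.sum()   # renormalize defensively
    q = q / q.sum()
    m = 0.5 * (p + q)
    def kl(a, b):
        return float(np.sum(a * np.log2((a + eps) / (b + eps))))
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

# Example: divergence between a sharp and a flat confidence distribution.
print(js_divergence(np.array([0.9, 0.05, 0.05]), np.array([1/3, 1/3, 1/3])))
```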

[1] B. Settles, “Active learning literature survey,” 2009.
[2] S. Hochreiter and J. Schmidhuber, “Long short-term memory,” Neural computation, vol. 9, no. 8, pp. 1735–1780, 1997.
[3] B. Bein, “Entropy,” Best Practice & Research Clinical Anaesthesiology, vol. 20, no. 1, pp. 101–109, 2006.
[4] J. Kremer, K. Steenstrup Pedersen, and C. Igel, “Active learning with support vector machines,” Wiley Interdisciplinary Reviews: Data Mining and Knowledge Discovery, vol. 4, no. 4, pp. 313–326, 2014.
[5] A. Krizhevsky and G. Hinton, “Learning multiple layers of features from tiny images,” 2009.
[6] A. Krizhevsky, V. Nair, and G. Hinton, “CIFAR-10 (Canadian Institute for Advanced Research),” http://www.cs.toronto.edu/~kriz/cifar.html, 2010.
[7] J. MacQueen, “Some methods for classification and analysis of multivariate observations,” in Proceedings of the fifth Berkeley symposium on mathematical statistics and probability, 1967, vol. 1, no. 14, pp. 281–297.
[8] M. Ester, H.-P. Kriegel, J. Sander, and X. Xu, “A density-based algorithm for discovering clusters in large spatial databases with noise,” in KDD, 1996, vol. 96, no. 34, pp. 226–231.
[9] J. A. Bilmes, “A gentle tutorial of the EM algorithm and its application to parameter estimation for Gaussian mixture and hidden Markov models,” International Computer Science Institute, vol. 4, no. 510, p. 126, 1998.
[10] M. Toneva, A. Sordoni, R. T. des Combes, A. Trischler, Y. Bengio, and G. J. Gordon, “An empirical study of example forgetting during deep neural network learning,” arXiv preprint arXiv:1812.05159, 2018.
[11] D. Yoo and I. S. Kweon, “Learning loss for active learning,” in Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, 2019, pp. 93–102.
[12] O. Sener and S. Savarese, “Active learning for convolutional neural networks: A core-set approach,” arXiv preprint arXiv:1708.00489, 2017.
[13] C. Coleman et al., “Selection via proxy: Efficient data selection for deep learning,” arXiv preprint arXiv:1906.11829, 2019.
[14] K. He, X. Zhang, S. Ren, and J. Sun, “Deep residual learning for image recognition,” in Proceedings of the IEEE conference on computer vision and pattern recognition, 2016, pp. 770–778.
[15] K. He, X. Zhang, S. Ren, and J. Sun, “Identity mappings in deep residual networks,” in Computer Vision–ECCV 2016: 14th European Conference, Amsterdam, The Netherlands, October 11–14, 2016, Proceedings, Part IV 14, 2016, pp. 630–645.
[16] S. Bhatnagar, S. Goyal, D. Tank, and A. Sethi, “PAL: Pretext-based active learning,” arXiv preprint arXiv:2010.15947, 2020.
[17] K. Lee, H. Lee, K. Lee, and J. Shin, “Training confidence-calibrated classifiers for detecting out-of-distribution samples,” arXiv preprint arXiv:1711.09325, 2017.
[18] C. Guo, G. Pleiss, Y. Sun, and K. Q. Weinberger, “On calibration of modern neural networks,” in International conference on machine learning, 2017, pp. 1321–1330.
[19] M. Tan and Q. Le, “EfficientNet: Rethinking model scaling for convolutional neural networks,” in International conference on machine learning, 2019, pp. 6105–6114.
[20] M. Menéndez, J. Pardo, L. Pardo, and M. Pardo, “The Jensen-Shannon divergence,” Journal of the Franklin Institute, vol. 334, no. 2, pp. 307–318, 1997.

Full-Text Release Date: 2073/08/01 (off-campus network)
Full-Text Release Date: 2073/08/01 (National Central Library: Taiwan NDLTD system)