
Author: Pin-Syuan Huang (黃品瑄)
Thesis Title: A Unified Active Learning Approach Considering Uncertainty, Diversity and Representativity (考慮不確定性、多樣性與代表性之統合主動學習法)
Advisor: Hsing-Kuo Pao (鮑興國)
Committee: Yuh-Jye Lee (李育杰), Chuan-Kai Yang (楊傳凱), Tien-Ruey Hsiang (項天瑞)
Degree: Master
Department: College of Electrical Engineering and Computer Science, Department of Computer Science and Information Engineering
Thesis Publication Year: 2017
Graduation Academic Year: 105
Language: English
Pages: 57
Keywords (in Chinese): 主動學習、查詢策略、不確定性、多樣性、代表性
Keywords (in other languages): active learning, query strategy, uncertainty, diversity, representativity
Reference times: Clicks: 304, Downloads: 20

In the era of big data, the Internet of Things (IoT) has become one of the most active research areas. To effectively build smart environments, such as smart homes, smart cities, and smart factories, we need data analysis and prediction techniques that help people create comfortable and convenient surroundings. However, data analysis and prediction usually require a large amount of costly manual labeling. We therefore adopt active learning to reduce this labeling cost while still achieving good predictive performance.

This thesis addresses a problem with the traditional uncertainty strategy in active learning: relying on uncertainty alone can overlook valuable data that would improve the model's predictive ability, and the resulting small labeled set can even cause overfitting. We therefore combine uncertainty with a diversity strategy, so that the data distribution is also considered when selecting points to label. We propose a method that dynamically adjusts the trade-off parameter by assessing model stability: in early rounds it emphasizes diversity to capture the data distribution, and as the model stabilizes it shifts attention to uncertainty, so that the predictive model reaches the expected performance faster. We additionally consider a representativity strategy that selects points based on a clustering algorithm to quickly capture the data distribution and effectively save manual labeling cost.
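The uncertainty strategy the abstract starts from can be illustrated with a minimal pool-based sketch. This is not the thesis's implementation: the least-confidence scoring rule and the toy probability estimates below are illustrative assumptions.

```python
import numpy as np

def least_confidence_query(proba, k=1):
    """Pick the k unlabeled points whose top-class probability is lowest,
    i.e. the points the current model is least certain about."""
    confidence = proba.max(axis=1)        # top-class probability per point
    return np.argsort(confidence)[:k]     # least-confident indices first

# Toy posterior estimates for a 4-point unlabeled pool (2 classes).
proba = np.array([[0.95, 0.05],
                  [0.55, 0.45],   # near the decision boundary -> uncertain
                  [0.80, 0.20],
                  [0.51, 0.49]])  # most uncertain
print(least_confidence_query(proba, k=2))   # -> [3 1]
```

Queried this way, the learner keeps drilling into the current decision boundary, which is exactly why a small labeled set can mislead it, motivating the diversity term below.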


With the rise of big data technology, the Internet of Things (IoT) has recently become one of the most popular research topics. Efficiently building smart environments, such as smart homes, smart cities, and smart factories, requires data analysis and prediction technologies that help people create convenient and comfortable surroundings. However, obtaining labeled data takes plenty of human effort. Therefore, we utilize active learning to minimize the manual labeling cost while still achieving good prediction performance.

In this thesis, we focus on the problem of the uncertainty strategy in traditional active learning: used alone, it may ignore data that are important for labeling and for improving the model's predictions, and it easily suffers from overfitting when only a small labeled set is available. Therefore, we combine it with a diversity strategy that explores the data distribution when selecting data for labeling. We propose a dynamic trade-off adjustment approach, which tunes the weighting parameter based on model stability so that uncertainty and diversity are emphasized at different stages. We also utilize a representativity strategy to explore the data distribution at the beginning. Together, these strategies quickly reach the expected model performance and effectively reduce the manual labeling cost.
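The combined criterion and the stability-driven parameter update described above can be sketched as follows. The specific scoring functions, the normalization, and the 0.95 stability threshold are assumptions for illustration, not the thesis's exact formulation.

```python
import numpy as np

def combined_query(proba, X_pool, X_labeled, lam):
    """Score = lam * uncertainty + (1 - lam) * diversity.
    uncertainty: 1 - top-class probability.
    diversity:   distance to the nearest already-labeled point,
                 scaled to [0, 1]."""
    uncertainty = 1.0 - proba.max(axis=1)
    dists = np.linalg.norm(X_pool[:, None, :] - X_labeled[None, :, :], axis=2)
    diversity = dists.min(axis=1)
    diversity = diversity / (diversity.max() + 1e-12)
    score = lam * uncertainty + (1.0 - lam) * diversity
    return int(np.argmax(score))

def update_lambda(lam, prev_pred, curr_pred, step=0.1):
    """Dynamic adjustment: if pool predictions barely changed between
    rounds, treat the model as stable and shift weight from diversity
    toward uncertainty."""
    stability = np.mean(prev_pred == curr_pred)   # fraction unchanged
    if stability > 0.95:
        lam = min(1.0, lam + step)
    return lam
```

With a small lambda early on, far-away (diverse) points win; once predictions stop changing, lambda grows and boundary (uncertain) points win, mirroring the explore-then-exploit schedule the abstract describes.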

Table of Contents:
Recommendation Letter
Approval Letter
Abstract in Chinese
Abstract in English
Acknowledgements
Contents
List of Figures
List of Tables
List of Algorithms
1 Introduction
  1.1 Motivation
  1.2 Proposed Method
  1.3 Thesis Outline
2 Related Work
3 Methodology
  3.1 Active Learning
    3.1.1 Scenarios
    3.1.2 Query Strategies
    3.1.3 Pool-Based Sampling
  3.2 Uncertainty and Diversity
    3.2.1 Uncertainty
    3.2.2 Diversity
    3.2.3 Combination of Uncertainty and Diversity
  3.3 Dynamically Adjusting the Trade-Off Parameter Lambda
  3.4 Adding Representativity
    3.4.1 Clustering-Based Approach
    3.4.2 Unweighted Pair Group Method with Arithmetic Mean
4 Experiments and Results
  4.1 SVMs Model
  4.2 Evaluation
  4.3 Dataset
  4.4 Data Preprocessing
  4.5 Experimental Setting
  4.6 Uncertainty and Diversity
  4.7 Stably Increasing the Trade-Off Parameter Lambda
  4.8 Dynamically Adjusting the Trade-Off Parameter Lambda
  4.9 Combining the Representativity
5 Conclusions
References
Letter of Authority
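The clustering-based representativity step in the outline (Sections 3.4.1 and 3.4.2) names UPGMA, average-linkage agglomerative clustering. A tiny sketch of that idea follows; picking the medoid of each cluster as the representative to label is an assumption for illustration, not necessarily the thesis's choice.

```python
import numpy as np

def upgma_clusters(X, n_clusters):
    """Tiny UPGMA: repeatedly merge the two clusters with the smallest
    average pairwise distance until n_clusters remain.
    Returns a list of index lists."""
    clusters = [[i] for i in range(len(X))]
    D = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    while len(clusters) > n_clusters:
        best = (0, 1, np.inf)
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                d = D[np.ix_(clusters[a], clusters[b])].mean()  # average linkage
                if d < best[2]:
                    best = (a, b, d)
        a, b, _ = best
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return clusters

def representatives(X, clusters):
    """Medoid of each cluster: the member closest on average to the rest."""
    reps = []
    for members in clusters:
        sub = np.linalg.norm(X[members][:, None] - X[members][None, :], axis=2)
        reps.append(members[int(sub.mean(axis=0).argmin())])
    return reps
```

Labeling one representative per cluster gives the learner a coarse map of the data distribution before any uncertainty-driven querying starts, which is the cold-start role the abstract assigns to representativity.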

