
Graduate Student: Zhi-Ye Fu (傅枝燁)
Thesis Title: Active Learning with Numerical Feature Annotation (包含連續型特徵標註的主動式學習)
Advisor: Hsing-Kuo Pao (鮑興國)
Committee Members: Yuh-Jye Lee (李育杰), Tien-Ruey Hsiang (項天瑞)
Degree: Master
Department: College of Electrical Engineering and Computer Science, Department of Computer Science and Information Engineering
Year of Publication: 2019
Academic Year of Graduation: 107 (2018-2019)
Language: English
Number of Pages: 68
Keywords (Chinese): 主動式學習、連續性特徵、特徵標註、向量量化
Keywords (English): Active Learning, Numerical Feature, Feature Annotation, Vector Quantization
    As is well known, active learning is an algorithm for settings where the
    training set contains a large amount of unlabeled data: the model actively
    selects instances and queries an expert for their labels, keeping the
    labeling cost as low as possible. Most research, however, concentrates on
    the query strategy of active learning, that is, on how the model selects
    the more informative instances. A few studies instead investigate how to
    query ordinary instances and the important features within the data at the
    same time. The Active Learning Dual Supervision (ALDS) model, for example,
    synthesizes a pseudo instance from important features, and in this way
    treats features as ordinary instances. It can only be applied to text and
    categorical data, however, because only such features have an explicit
    meaning.
    This thesis therefore studies how to perform feature annotation on
    continuous numerical data. We first convert continuous features into
    discrete ones by vector quantization, so that the features acquire
    explicit meanings and can be used to synthesize pseudo instances. On this
    basis, we examine in depth the features that the dual-supervision model
    selects for instance synthesis, in order to verify that the model
    discovers the truly important features at an early stage of training and
    that its feature-importance scoring mechanism is reliable. Beyond adopting
    vector quantization, we further compare two clustering algorithms for it,
    K-means and the Gaussian Mixture Model; since different clustering
    algorithms suit different kinds of data sets, this comparison is also one
    of the research points of this thesis.
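The discretization step described above can be sketched with scikit-learn, which provides both clustering algorithms discussed in the thesis. The feature values and the number of clusters below are illustrative assumptions, not taken from the thesis's experiments.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.mixture import GaussianMixture

# One numerical feature with three well-separated value ranges
# (illustrative values only).
x = np.array([[1.0], [1.2], [0.9], [5.1], [5.3], [9.8], [10.1]])

# Vector quantization with K-means: each value is replaced by the index
# of its nearest cluster centroid, turning the numerical feature into a
# categorical one.
km = KMeans(n_clusters=3, n_init=10, random_state=0)
km_codes = km.fit_predict(x)

# The same quantization with a Gaussian Mixture Model: each value is
# assigned to the mixture component with the highest posterior probability.
gmm = GaussianMixture(n_components=3, random_state=0)
gmm_codes = gmm.fit(x).predict(x)

print(km_codes)   # one discrete code per value
print(gmm_codes)
```

After this step each original value is represented by a discrete code (e.g. a "low", "mid", or "high" bin), which is what makes the feature meaningful enough to annotate.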


    Active learning is an algorithm that, when the training set contains
    abundant unlabeled data, lets the model actively select informative
    instances to query with as little labeling cost as possible. Most
    research, however, focuses on the query strategy of active learning,
    namely how the model selects informative instances. Some work instead
    studies querying labels for instances and features simultaneously; for
    example, the Active Learning Dual Supervision (ALDS) model treats
    features as ordinary instances by synthesizing pseudo instances from
    selected features. It can only be applied to text and categorical data,
    because only such features are meaningful. This research therefore
    focuses on how to label numerical features simultaneously in active
    learning. Our approach is intuitive: we convert numerical features to
    categorical ones with a vector quantization method. The features then
    become meaningful, which means they can be used to synthesize pseudo
    instances. On this basis, we investigate in detail the features selected
    for instance synthesis in ALDS, to see whether the model finds the
    important features at an early stage; the more important features ALDS
    finds, the better the resulting model performs. Moreover, we not only
    adopt vector quantization but also compare two clustering methods for
    it, K-means and the Gaussian Mixture Model, and discuss which kind of
    data suits each method.
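The abstract does not spell out the exact ALDS synthesis rule, so the sketch below uses a simplified stand-in: an annotated (discretized) feature is turned into one pseudo training instance by fixing that feature at the centroid of its discrete bin and filling the remaining features with their column means. The arrays `X` and `codes` and the helper `synthesize_instance` are illustrative assumptions, not the thesis's procedure.

```python
import numpy as np

# Toy data: 4 instances, 2 numerical features (illustrative values only).
X = np.array([[1.0, 10.0],
              [1.2, 11.0],
              [5.0, 30.0],
              [5.2, 31.0]])
# Discretized code of each feature value (e.g. from vector quantization).
codes = np.array([[0, 0],
                  [0, 0],
                  [1, 1],
                  [1, 1]])

def synthesize_instance(X, codes, feature_idx, code_value):
    """Turn one annotated feature into a pseudo training instance.

    The annotated feature is set to the mean of the values falling in its
    discrete code; every other feature takes its overall column mean.
    This is a simplified stand-in for the ALDS synthesis step, not the
    exact rule used in the thesis.
    """
    pseudo = X.mean(axis=0)                       # neutral baseline
    mask = codes[:, feature_idx] == code_value    # rows in that code/bin
    pseudo[feature_idx] = X[mask, feature_idx].mean()
    return pseudo

# Feature 0 annotated with code 1 (the "high" bin in this toy data).
print(synthesize_instance(X, codes, feature_idx=0, code_value=1))
```

The resulting vector can then be added to the labeled pool with the class the expert associated with the feature, which is how feature feedback re-enters ordinary instance-based training.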

    Contents
    Recommendation Letter
    Approval Letter
    Abstract in Chinese
    Abstract in English
    Acknowledgments
    Contents
    List of Figures
    List of Tables
    1 Introduction
      1.1 Motivation
      1.2 Proposed Method
      1.3 Outline
    2 Related Work
    3 Methodology
      3.1 Overall Framework
      3.2 Vector Quantization
        3.2.1 K-means
        3.2.2 Gaussian Mixture Model
      3.3 Active Learning
        3.3.1 Active Learning Scenarios
        3.3.2 Query Strategies
      3.4 Active Learning Dual Supervision
        3.4.1 Hierarchical Clustering
        3.4.2 Pure Cluster Filtering
        3.4.3 Feature Scoring
        3.4.4 Instance Synthesizing
    4 Experiments and Results
      4.1 Dataset
        4.1.1 Pima Indian Diabetes
        4.1.2 NBA
        4.1.3 Magic Gamma Telescope
      4.2 Data Preprocessing
      4.3 Experiment Setting
      4.4 Experiment Evaluation
      4.5 Experiment Result
        4.5.1 Adding Feature Annotation on Numerical Data
        4.5.2 Synthesized Instances in each Data Set
        4.5.3 Vector Quantization Method Analysis
        4.5.4 Vector Quantization with Combination
    5 Conclusions
      5.1 Future Work
    References


    Full-text release date: 2024/07/01 (campus network)
    Full text not authorized for public release (off-campus network)
    Full text not authorized for public release (National Central Library: Taiwan NDLTD system)