
Graduate Student: Ho-shun Hung (洪賀順)
Thesis Title: Classification with High Intra-Class Variation: A Transfer Learning Approach (利用遷移式學習技術應用於組內差異性高的資料集分類方法之研究)
Advisor: Hsing-Kuo Pao (鮑興國)
Committee Members: Yuh-Jye Lee (李育杰), Tyng-Luh Liu (劉庭祿), Chuan-Kai Yang (楊傳凱)
Degree: Master
Department: Department of Computer Science and Information Engineering, College of Electrical Engineering and Computer Science
Year of Publication: 2012
Graduation Academic Year: 100 (ROC calendar, 2011-2012)
Language: English
Number of Pages: 42
Chinese Keywords: 遷移式學習, TrAdaBoost, 組內差異性高資料集
English Keywords: transfer learning, TrAdaBoost, high intra-class variation data

We propose a binary classification method based on transfer learning for datasets with high intra-class variation. When only a small amount of such data can be collected, classification results are usually poor: a single class may be subdivided into many sub-classes, leaving even fewer samples per sub-class. For conventional instance-based methods such as boosting or AdaBoost, training then easily produces a weak classifier whose error rate exceeds one half, so the procedure halts with a useless result. We propose a transfer learning technique that efficiently integrates the useful information in high intra-class variation data and thereby achieves good classification performance. Following an idea similar to TrAdaBoost, our method distributes the data appropriately between the target and source domains, and then uses the transfer concept to select useful information from the source domain and combine it with the target-domain data, so that a relatively rich training set can be assembled. Unlike TrAdaBoost, we no longer blindly decrease the weights of source-domain data, which lets us retain more useful data. Our method makes two main contributions: it successfully handles the classification problem posed by high intra-class variation data, and it improves the performance of TrAdaBoost when the target and source domains consist of such data. Experimental results show that, on this type of data, the proposed method achieves higher accuracy than general classification methods and also outperforms TrAdaBoost.
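For context, the weight update of the original TrAdaBoost (Dai et al., ICML 2007) that the abstract contrasts with can be written as follows; this is the standard formulation from that paper, not the rule defined in this thesis. Here $h_t$ is the round-$t$ weak classifier, $\epsilon_t$ its error on the target domain, $n$ the number of source samples, and $N$ the number of boosting rounds:
\[
w_i^{t+1} =
\begin{cases}
w_i^{t}\,\beta^{\,|h_t(x_i)-y_i|}, & x_i \in \text{source domain},\\
w_i^{t}\,\beta_t^{\,-|h_t(x_i)-y_i|}, & x_i \in \text{target domain},
\end{cases}
\qquad
\beta = \frac{1}{1+\sqrt{2\ln n / N}},
\qquad
\beta_t = \frac{\epsilon_t}{1-\epsilon_t}.
\]
Since $\beta < 1$, a misclassified source sample can only lose weight under this rule; the monotone decrease on the source side is exactly what the proposed method relaxes.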


We propose a method based on a transfer learning approach to deal with the classification of high intra-class variation data. High intra-class variation is difficult to model, especially when only a limited dataset is available: a single concept may consist of several diverse sub-concepts, each with very few samples. Boosting or AdaBoost, for instance, cannot help much in this case, because we may easily produce a weak classifier with an error rate higher than one half, at which point the boosting procedure halts. We propose a transfer learning approach that effectively integrates the information from high-variation samples for successful modeling. In our approach, we put samples of high variation into the source and target domains, as in the design of TrAdaBoost; then, gradually, we select useful data from the source domain and combine them with the data in the target domain to form a rich training set. What differs from TrAdaBoost is that in our approach the weights of source-domain data are not necessarily decreased at every round; therefore, the proposed method can collect more useful data from the source domain than the typical TrAdaBoost. Our contribution is twofold: on one hand, we can successfully deal with high intra-class variation data; on the other hand, we can also improve the performance of TrAdaBoost when the data in the source and target domains exhibit high variation. Experimental results show that the proposed method achieves higher accuracy than other classification methods such as AdaBoost on high intra-class variation data; moreover, it performs better than TrAdaBoost on the same types of data.
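To make the mechanism concrete, the following is a minimal, hypothetical Python sketch of a TrAdaBoost-style training loop. The decrease_only=True branch follows the original TrAdaBoost source-weight update; the decrease_only=False branch is only an illustrative stand-in for the idea of not always down-weighting source samples, since the thesis's actual selection rule is not reproduced here. The function name, the decision-stump weak learner, the softened factor, and the toy data are all assumptions.

import numpy as np
from sklearn.tree import DecisionTreeClassifier

def tradaboost_sketch(X_src, y_src, X_tgt, y_tgt, n_rounds=10, decrease_only=True):
    # Minimal TrAdaBoost-style loop (after Dai et al., ICML 2007).
    # Labels are assumed to be in {0, 1}.
    n_src = len(y_src)
    X = np.vstack([X_src, X_tgt])
    y = np.concatenate([y_src, y_tgt])
    w = np.ones(len(y)) / len(y)
    beta = 1.0 / (1.0 + np.sqrt(2.0 * np.log(n_src) / n_rounds))
    learners, betas = [], []
    for _ in range(n_rounds):
        h = DecisionTreeClassifier(max_depth=1)      # weak learner: a stump
        h.fit(X, y, sample_weight=w / w.sum())
        miss = (h.predict(X) != y).astype(float)
        # TrAdaBoost measures the error on the target domain only
        eps = np.sum(w[n_src:] * miss[n_src:]) / np.sum(w[n_src:])
        if eps >= 0.5 or eps == 0.0:                 # boosting would halt here
            break
        beta_t = eps / (1.0 - eps)
        w[n_src:] *= beta_t ** (-miss[n_src:])       # up-weight target mistakes
        if decrease_only:
            # original rule: misclassified source samples always shrink
            w[:n_src] *= beta ** miss[:n_src]
        else:
            # hypothetical softened rule: shrink misclassified source samples
            # more mildly so useful source data retain more influence;
            # NOT the thesis's actual rule, just a stand-in for the idea
            w[:n_src] *= (0.5 + 0.5 * beta) ** miss[:n_src]
        learners.append(h)
        betas.append(beta_t)
    return learners, betas   # a final vote would follow TrAdaBoost's scheme

# Toy usage with synthetic data; labels follow the sign of the first feature.
rng = np.random.default_rng(0)
X_src = rng.normal(size=(200, 2)); y_src = (X_src[:, 0] > 0).astype(int)
X_tgt = rng.normal(size=(30, 2));  y_tgt = (X_tgt[:, 0] > 0).astype(int)
learners, betas = tradaboost_sketch(X_src, y_src, X_tgt, y_tgt)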

1 Introduction
  1.1 Problem Proposed
  1.2 Research Framework
  1.3 Thesis Outline
2 Background
  2.1 AdaBoost Algorithm
  2.2 Transfer AdaBoost (TrAdaBoost) Algorithm
3 Methodology
  3.1 High Intra-Class Variation Data
  3.2 Classification with High Intra-Class Variation Data
  3.3 Drawback of the TrAdaBoost
  3.4 Data Selection
4 Experiment Results
  4.1 Feature Extraction
    4.1.1 Manifold Learning
    4.1.2 ISOMAP
  4.2 Experiment: Two-Dimensional Synthetic Data
  4.3 Experiment: Curved or Linear?
    4.3.1 Detection of the Curved and Linear Alphabet
    4.3.2 Detection of the Curved and Linear Digital Numbers
  4.4 Experiment: Apply to the Human Face Problem
5 Conclusion and Future Work
  5.1 Conclusion
  5.2 Future Work

