
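Graduate student: 江姵璇 (Pei-Hsuan Chiang)
Thesis title: 有限時間主動學習法 (Limited-Time Active Learning)
Advisor: 鮑興國 (Hsing-Kuo Kenneth Pao)
Oral examination committee: 李育杰 (Yuh-Jye Lee), 項天瑞 (Tien-Ruey Hsiang), 鄭欣明 (Shin-Ming Cheng), 栗永徽 (Yung-Hui Li)
Degree: Master
Department: College of Electrical Engineering and Computer Science - Department of Computer Science and Information Engineering
Year of publication: 2015
Academic year of graduation: 103 (ROC calendar)
Language: English
Number of pages: 40
Chinese keywords: 行為辨識 (activity recognition), 時間序列資料 (time-series data), 主動學習 (active learning)
English keywords: activity recognition, time-series, active learning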
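  • With the arrival of the big-data era, the Internet of Things has risen alongside it, and its underlying technologies are widely discussed and studied. Human activity recognition is an important part of building smart environments, and it also faces the challenge of handling large-scale data. This thesis uses active learning as its basic framework. Within that framework, and targeting time-series data such as human activity recognition data, it proposes label-querying methods for different scenarios. Because real-world data is collected continuously, instances are visited in arrival order, and for each one the method decides whether to add it to the training set and update the model.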

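    To address the large labeling effort and training time that large-scale data requires, we propose a limited-time active learning algorithm. It tends to select more training data in the early stage of data collection and estimates probably approximately correct (PAC) learnability, which serves as the criterion for whether the label-querying threshold can be raised later on. This limits the number of times the model changes and addresses the difficulty that labeling cost grows over time.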


    With the rapid development of the Internet of Things (IoT) and big data research, the underlying technology has been widely discussed. In building IoT smart environments, human activity recognition plays an important role, which brings the challenge of annotating and processing large-scale data. In this thesis, we use active learning as our basic framework and apply it to human activity recognition. We design several query strategies for different scenarios. We also take into account that, in reality, data arrives in order: the query strategies are designed for streaming data and decide whether or not to include each newly arriving instance.

    Because labeling and training on large-scale data are difficult, we propose limited-time active learning. The algorithm tends to query labels and update the model at the very beginning, and it queries far fewer labels as time goes by. In this way we limit the number of changes made to the model and, at the same time, reduce the burden of expensive labeling.
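
The querying behaviour described in the abstract can be illustrated with a minimal Python sketch. It is not the thesis's algorithm: the names `score`, `query_threshold`, and `limited_time_stream`, the parameters `base`, `decay`, and `lr`, and the decay schedule itself are all assumptions made for illustration. A small linear model with an additive update stands in for the SVM listed in the contents, and a fixed schedule stands in for the PAC-based criterion mentioned in the Chinese abstract. The sketch visits instances in arrival order and asks for a label only when the model's confidence falls inside a band that narrows over time, so the bar for querying rises as the stream goes on.

```python
import random


def score(weights, x):
    """Signed output of a linear model; its magnitude serves as confidence."""
    return sum(w * xi for w, xi in zip(weights, x))


def query_threshold(t, base=1.0, decay=0.01):
    """Hypothetical schedule: the confidence level below which a label
    is requested shrinks as the stream position t grows."""
    return base / (1.0 + decay * t)


def limited_time_stream(stream, oracle, dim, lr=0.1, base=1.0, decay=0.01):
    """Visit instances in arrival order; query and update only when the
    model's confidence falls below the current threshold."""
    weights = [0.0] * dim
    query_times = []
    for t, x in enumerate(stream):
        if abs(score(weights, x)) < query_threshold(t, base, decay):
            y = oracle(x)  # label in {-1, +1}, obtained at a cost
            query_times.append(t)
            # Simple additive update, an assumed stand-in for retraining.
            weights = [w + lr * y * xi for w, xi in zip(weights, x)]
    return weights, query_times


if __name__ == "__main__":
    rng = random.Random(0)
    stream = [(rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(500)]

    def oracle(x):
        # Assumed ground truth for the demo: sign of x[0] + x[1].
        return 1 if x[0] + x[1] >= 0 else -1

    weights, query_times = limited_time_stream(stream, oracle, dim=2)
    early = sum(1 for t in query_times if t < 250)
    print("queries:", len(query_times), "in first half:", early)
```

Every query in this sketch also triggers a model update, so the length of `query_times` equals the number of model changes, which is the quantity the abstract aims to limit.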

    Recommendation Letter
    Approval Letter
    Abstract in Chinese
    Abstract in English
    Acknowledgements
    Contents
    List of Figures
    List of Tables
    List of Algorithms
    1 Introduction
        1.1 Motivation
        1.2 Related Work
        1.3 Proposed Method
        1.4 Thesis Outline
    2 Methodology
        2.1 Active Learning
            2.1.1 Scenarios
            2.1.2 Querying Strategy
            2.1.3 Stream-Based Active Learning
        2.2 Limited-Time Active Learning
            2.2.1 Limited-Time Active Learning with Buffer
            2.2.2 Limited-Time Active Learning in Specific Time Interval
        2.3 Label Propagation and Semi-Supervised Learning
    3 Experiment
        3.1 SVM Model
        3.2 Evaluation
        3.3 General Dataset
            3.3.1 UCI Dataset
            3.3.2 Data Preprocessing
            3.3.3 Cost Estimation
            3.3.4 Experimental Setting
            3.3.5 Result and Analysis
        3.4 Time-Series Dataset
            3.4.1 Data Preprocessing
        3.5 Cost Estimation
            3.5.1 Experimental Setting
            3.5.2 Stream-Based Active Learning
            3.5.3 Limited-Time Active Learning
            3.5.4 Limited-Time Active Learning with Buffer
            3.5.5 Limited-Time Active Learning in Specified Time Interval
    4 Conclusion


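    Full-text release date: 2018/08/27 (campus network)
    Full-text release date: not authorized for public release (off-campus network)
    Full-text release date: not authorized for public release (National Central Library: National Digital Library of Theses and Dissertations in Taiwan)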