
研究生: MARIA CAROLINA NOVITASARI
MARIA CAROLINA NOVITASARI
Thesis Title: Incorporating Periodicity Analysis in Active Learning for Multivariate Time Series Classification
Advisor: Hsing-Kuo Pao (鮑興國)
Committee: Hsing-Kuo Pao (鮑興國), Bi-Ru Dai (戴碧如), Tien-Ruey Hsiang (項天瑞), Min-Te Sun (孫敏德)
Degree: Master
Department: Department of Computer Science and Information Engineering, College of Electrical Engineering and Computer Science
Publication Year: 2017
Graduation Academic Year: 105 (ROC calendar)
Language: English
Number of Pages: 78
Keywords: multivariate time series, active learning, periodicity analysis, classification
Abstract: To classify time series data in the traditional way, we need a large amount of labeled data for the training phase. In reality, labeled data are often scarce while unlabeled data are abundant, and labeling them manually is time-consuming and expensive. Hence, we apply active learning to query the data efficiently and minimize the total cost of labeling. Meanwhile, time series data often exhibit periodic patterns; periodicity is one of the fundamental properties of time series. By discovering potential periods in time series data, we can obtain important additional information about the data, which may even allow us to label some instances directly. Here we aim to provide an approach that analyzes periodicity in time series data to help active learning achieve higher accuracy with fewer labeled instances. Through periodicity analysis, our approach can extract temporal features, select the most useful unlabeled instances, and label unlabeled instances. We present an algorithm that analyzes periodic patterns in multi-class time series sequences and achieves state-of-the-art results when combined with active learning.
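The period-discovery step described above can be illustrated with a periodogram built from the discrete Fourier transform: dominant peaks in the power spectrum correspond to candidate periods. This is a minimal, dependency-free sketch, not the thesis's exact procedure; the synthetic signal, the `top_k` parameter, and the function name are illustrative assumptions.

```python
import numpy as np

def detect_periods(series, top_k=3):
    """Return top_k candidate periods (in samples) from periodogram peaks."""
    n = len(series)
    detrended = series - series.mean()            # drop the DC component
    power = np.abs(np.fft.rfft(detrended)) ** 2   # periodogram estimate
    freqs = np.fft.rfftfreq(n, d=1.0)             # cycles per sample
    peaks = np.argsort(power[1:])[::-1] + 1       # rank bins, skipping frequency 0
    return [round(1.0 / freqs[i]) for i in peaks[:top_k]]

# Example: noisy sinusoid whose true period is 24 samples.
rng = np.random.default_rng(0)
t = np.arange(480)
signal = np.sin(2 * np.pi * t / 24) + 0.3 * rng.standard_normal(t.size)
print(detect_periods(signal))
```

In practice a smoothed estimator such as Welch's method (averaging periodograms over overlapping windows, as covered in the methodology chapter) is less noisy than the raw periodogram used here.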



Table of Contents

Abstract
Acknowledgment
Table of Contents
List of Tables
List of Figures
1 Introduction
  1.1 Motivation
  1.2 Related Work
    1.2.1 Periodicity Analysis
    1.2.2 Active Learning
  1.3 General Framework
  1.4 Thesis Outline
2 Methodology
  2.1 Support Vector Machine (SVM)
  2.2 Active Learning
    2.2.1 Scenarios
    2.2.2 Query Strategy
  2.3 Periodicity Analysis
    2.3.1 Discrete Fourier Transform
    2.3.2 Fast Fourier Transform
    2.3.3 Power Spectral Density Estimation
3 Proposed Method
  3.1 Research Framework
    3.1.1 Periods Detection
  3.2 Scenarios
    3.2.1 Non-Dynamic Data
    3.2.2 Dynamic Data
    3.2.3 Combining Informativeness, Representativeness, and Diversity
    3.2.4 Considering Informativeness, Representativeness, Diversity, and Periodicity Confidence as Query Strategy Integrated with Label Propagation through Periodicity Analysis Approach
  3.3 Labeling Scenario
  3.4 Model Building
4 Experiment Results
  4.1 Dataset
  4.2 Data Preprocessing
  4.3 Experimental Evaluation
  4.4 Periodicity Analysis
    4.4.1 Synthetic Data with Multiple Periods
    4.4.2 Real Case Data
  4.5 Active Learning
    4.5.1 Experimental Settings
    4.5.2 Non-Dynamic Data
    4.5.3 Dynamic Data
    4.5.4 Combining Informativeness, Representativeness, and Diversity
    4.5.5 Considering Informativeness, Representativeness, Diversity, and Periodicity Confidence as Query Strategy Integrated with Label Propagation through Periodicity Analysis Approach
  4.6 Summary of Results
5 Conclusions
References
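The query-strategy chapters above build on pool-based uncertainty sampling: repeatedly train a classifier on the current labeled set, then ask the oracle to label the pool instance closest to the decision boundary. The following sketch shows only that uncertainty component on synthetic data; the thesis's actual strategy additionally weighs representativeness, diversity, and periodicity confidence, and uses an SVM, for which a dependency-free nearest-centroid learner stands in here.

```python
import numpy as np

rng = np.random.default_rng(0)
# Two Gaussian classes as a stand-in for extracted temporal features.
X = np.vstack([rng.normal(-1.0, 1.0, (100, 2)), rng.normal(1.0, 1.0, (100, 2))])
y = np.array([0] * 100 + [1] * 100)

labeled = [0, 1, 100, 101]                        # small seed set, two per class
pool = [i for i in range(len(X)) if i not in labeled]

def centroids(idx):
    return [X[[i for i in idx if y[i] == c]].mean(axis=0) for c in (0, 1)]

def score(x, cents):
    # Signed closeness to the class-1 side: d(x, c0)^2 - d(x, c1)^2.
    return np.sum((x - cents[0]) ** 2) - np.sum((x - cents[1]) ** 2)

for _ in range(20):                               # labeling budget of 20 queries
    cents = centroids(labeled)
    uncertainty = [abs(score(X[i], cents)) for i in pool]
    query = pool.pop(int(np.argmin(uncertainty))) # nearest to the boundary
    labeled.append(query)                         # the oracle reveals y[query]

cents = centroids(labeled)
pred = np.array([int(score(x, cents) > 0) for x in X])
acc = float((pred == y).mean())
print(f"accuracy after {len(labeled)} labels: {acc:.2f}")
```

The design point this illustrates is cost: each iteration spends one label on the most ambiguous instance instead of labeling the whole pool, which is exactly the saving the abstract attributes to active learning.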

