
Author: MARIA CAROLINA NOVITASARI
Thesis Title: Incorporating Periodicity Analysis in Active Learning for Multivariate Time Series Classification
Advisor: Hsing-Kuo Pao (鮑興國)
Committee Members: Hsing-Kuo Pao (鮑興國), Bi-Ru Dai (戴碧如), Tien-Ruey Hsiang (項天瑞), Min-Te Sun (孫敏德)
Degree: Master
Department: College of Electrical Engineering and Computer Science - Department of Computer Science and Information Engineering
Publication Year: 2017
Graduation Academic Year: 105
Language: English
Number of Pages: 78
Chinese Keywords: multivariate time series, active learning, periodicity analysis, classification
Foreign-Language Keywords: multivariate time series, active learning, periodicity analysis, classification

To classify time series data in the traditional way, we need a large amount of labeled data for the training phase. In reality, labeled data are often scarce while unlabeled data are abundant, and labeling the unlabeled data manually is time-consuming and expensive. Hence, we use active learning to query the data efficiently and minimize the total labeling cost. Meanwhile, when we observe time series data, we often see periodic patterns; periodicity is one of the general characteristics of time series data. By discovering potential periods in a time series, we can obtain important information about the data, which may even allow us to label it. Here we aim to provide an approach that analyzes periodicity in time series data to help active learning do a better job: achieving higher accuracy with fewer labeled instances. Through periodicity analysis, our approach can extract temporal features, select the best unlabeled instances to query, and label the remaining unlabeled instances. We present an algorithm that analyzes periodic patterns in multi-class time series sequences and achieves state-of-the-art results when combined with active learning.
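The period-discovery step described above can be illustrated with a standard FFT periodogram: the dominant period of a signal corresponds to the peak of its estimated power spectral density. The sketch below is a minimal illustration under that standard technique, not the thesis implementation; the function `dominant_period` and its defaults are hypothetical.

```python
import numpy as np

def dominant_period(signal, sampling_rate=1.0):
    """Return the period whose frequency bin carries the most spectral power."""
    n = len(signal)
    # Remove the mean so the zero-frequency (DC) bin does not dominate.
    centered = signal - np.mean(signal)
    spectrum = np.fft.rfft(centered)
    power = np.abs(spectrum) ** 2                     # periodogram PSD estimate
    freqs = np.fft.rfftfreq(n, d=1.0 / sampling_rate)
    peak = np.argmax(power[1:]) + 1                   # skip the DC bin
    return 1.0 / freqs[peak]

# A sine wave with period 20 samples should be recovered exactly,
# since 200 samples hold an integer number (10) of cycles.
t = np.arange(200)
x = np.sin(2 * np.pi * t / 20.0)
print(round(dominant_period(x)))  # → 20
```

In practice, periods that do not divide the window length leak across neighboring bins, which is why averaged estimators such as Welch's method (Section 2.3.3 of the thesis outline) are preferred over the raw periodogram.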


To classify time series data with the traditional way, we need a huge labeled data as the training phase. In reality, the number of labeled data is often smaller and there is a huge number of unlabeled data. At that point, we will manually label those unlabeled data. However, it is time-consuming and expensive. Hence, we do active learning to efficiently querying the data and minimizing the total cost of labeling. Meanwhile, when we observe time series data, we might see a periodic pattern there. As we know that periodicity is one of the general aspects of time series data. By discovering potential periods in time series data, we can get more important information about the data. We might be able to label the data by knowing this information. Here we aim to provide an approach which is able to analyze periodicity in time series data to help active learning to do a better job and achieve higher accuracy using less number of labeled data. Through periodicity analysis, our approach can extract temporal features, select the best unlabeled instances, and label the unlabeled instances. Our approach presents the algorithm to analyze the periodic pattern for handling multi-class time series sequence and provide the state of the art results when combine it with active learning.

Table of Contents
Abstract
Acknowledgment
Table of Contents
List of Tables
List of Figures
1 Introduction
  1.1 Motivation
  1.2 Related Work
    1.2.1 Periodicity Analysis
    1.2.2 Active Learning
  1.3 General Framework
  1.4 Thesis Outline
2 Methodology
  2.1 Support Vector Machine (SVM)
  2.2 Active Learning
    2.2.1 Scenarios
    2.2.2 Query Strategy
  2.3 Periodicity Analysis
    2.3.1 Discrete Fourier Transform
    2.3.2 Fast Fourier Transform
    2.3.3 Power Spectral Density Estimation
3 Proposed Method
  3.1 Research Framework
    3.1.1 Periods Detection
  3.2 Scenarios
    3.2.1 Non-Dynamic Data
    3.2.2 Dynamic Data
    3.2.3 Combining Informativeness, Representativeness, and Diversity
    3.2.4 Considering Informativeness, Representativeness, Diversity, and Periodicity Confidence as Query Strategy Integrated with Label Propagation through Periodicity Analysis Approach
  3.3 Labeling Scenario
  3.4 Model Building
4 Experiment Results
  4.1 Dataset
  4.2 Data Preprocessing
  4.3 Experimental Evaluation
  4.4 Periodicity Analysis
    4.4.1 Synthetic Data with Multiple Periods
    4.4.2 Real Case Data
  4.5 Active Learning
    4.5.1 Experimental Settings
    4.5.2 Non-Dynamic Data
    4.5.3 Dynamic Data
    4.5.4 Combining Informativeness, Representativeness, and Diversity
    4.5.5 Considering Informativeness, Representativeness, Diversity, and Periodicity Confidence as Query Strategy Integrated with Label Propagation through Periodicity Analysis Approach
  4.6 Summary of Results
5 Conclusions
References
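The query strategies the outline surveys (Section 2.2.2) include uncertainty sampling, where the learner asks for the label of the instance its current model is least sure about. A common multi-class variant picks the instance with the smallest margin between the top two class probabilities. The sketch below is a hypothetical illustration of that standard criterion, not the thesis code; `query_most_uncertain` and the example probabilities are invented for demonstration.

```python
import numpy as np

def query_most_uncertain(probabilities):
    """Margin-based uncertainty sampling.

    probabilities: (n_instances, n_classes) array of class probabilities.
    Returns the index of the instance whose top two class probabilities
    are closest, i.e. where the model is least decided.
    """
    sorted_p = np.sort(probabilities, axis=1)
    margins = sorted_p[:, -1] - sorted_p[:, -2]  # best minus second-best
    return int(np.argmin(margins))

probs = np.array([
    [0.90, 0.05, 0.05],  # confident prediction, margin 0.85
    [0.40, 0.35, 0.25],  # ambiguous prediction, margin 0.05
    [0.70, 0.20, 0.10],  # margin 0.50
])
print(query_most_uncertain(probs))  # → 1
```

With an SVM, such class probabilities are typically obtained through Platt scaling of the decision values, which is why the thesis outline treats SVMs and query strategies together.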
