簡易檢索 / 詳目顯示

研究生: 張國清
Frederik Darwin
論文名稱: 運用於時間序列不規則搜尋之KL散度區域遞迴演算法
Local Recurrence Rate based Discord Search Algorithm with Kullback-Leibler Divergence (LRRDS-KL) for Time Series Anomaly Search
指導教授: 楊朝龍
Chao-Lung Yang
口試委員: 歐陽超
Chao Ou-Yang
林柏廷
Po Ting Lin
學位類別: 碩士
Master
系所名稱: 管理學院 - 工業管理系
Department of Industrial Management
論文出版年: 2020
畢業學年度: 108
語文別: 英文
論文頁數: 50
中文關鍵詞: Multivariate time seriesDiscord searchRecurrence structureLocal recurrence ratesStreaming datasetKL Divergence
外文關鍵詞: Multivariate time series, Discord search, Recurrence structure, Local recurrence rates, Streaming dataset, KL Divergence
相關次數: 點閱:248下載:1
分享至:
查詢本校圖書館目錄 查詢臺灣博碩士論文知識加值系統 勘誤回報

Discord search in time series data plays an important role in analyzing data collected from the manufacturing site. By identifying anomalous subsequences of time series data collected from the manufacturing machines, the issue of malfunction based on the evaluation of operating conditions can be detected. Recently, a method called Local Recurrence Rate based Discord Search (LRRDS) was proposed to convert time series data into Recurrence Plot (RP) for further discord investigation. Although LRRDS shows promising performance, it could be further improved. Moreover, its capability in the streaming environment also has not been tested. In this research, an improved computational framework for discord search under LRRDS is proposed. First, the time series data is transformed into an RP and the Kullback-Leibler divergence is used to provide more accurate distance measure to its RP formulation. Second, the subsequence characteristic measurement is improved by adding integral as a statistical feature. Then, each corresponding subsequence can be compared with other non-overlapping neighbor subsequences. Subsequence with the largest value of dissimilarity is considered as a discord by the algorithm. The capability of the improved LRRDS and the original LRRDS were compared on various multivariate time series datasets, and also in a simulated streaming environment. The results show that the proposed approach can detect discords and it works fine under a simulated streaming environment.


Discord search in time series data plays an important role in analyzing data collected from the manufacturing site. By identifying anomalous subsequences of time series data collected from the manufacturing machines, the issue of malfunction based on the evaluation of operating conditions can be detected. Recently, a method called Local Recurrence Rate based Discord Search (LRRDS) was proposed to convert time series data into Recurrence Plot (RP) for further discord investigation. Although LRRDS shows promising performance, it could be further improved. Moreover, its capability in the streaming environment also has not been tested. In this research, an improved computational framework for discord search under LRRDS is proposed. First, the time series data is transformed into an RP and the Kullback-Leibler divergence is used to provide more accurate distance measure to its RP formulation. Second, the subsequence characteristic measurement is improved by adding integral as a statistical feature. Then, each corresponding subsequence can be compared with other non-overlapping neighbor subsequences. Subsequence with the largest value of dissimilarity is considered as a discord by the algorithm. The capability of the improved LRRDS and the original LRRDS were compared on various multivariate time series datasets, and also in a simulated streaming environment. The results show that the proposed approach can detect discords and it works fine under a simulated streaming environment.

ABSTRACT ii TABLE OF CONTENTS iii LIST OF FIGURE v LIST OF TABLE vi CHAPTER 1 1 CHAPTER 2 3 2.1 Recurrence plot and recurrence quantification analysis 3 2.2 Local recurrence rate based discord search (LRRDS) 3 2.2.1 Data normalization 4 2.2.2 Phase Space Reconstruction 4 2.2.3 Recurrence Plot (RP) 5 2.2.4 Local Recurrence Rate (LREC) 6 2.2.5 Detecting discords 6 2.3 Parameter Optimization and Acceleration 7 2.3.1 Piecewise Aggregate Approximation (PAA) 7 2.3.2 Slack Margin 8 2.3.3 Parameter Optimization 9 CHAPTER 3 10 3.1 Data Preprocessing 11 3.1.1 Data Normalization 11 3.1.2 PAA Compression 12 3.2 RP Representation 12 3.3 LREC and Change Point Detection 13 3.3.1 LREC 14 3.3.2 Change Point Detection 14 3.4 Discord Detection 15 3.4.1 Second RP representation 16 3.5 Streaming Dataset Preparation 18 3.5.1 Simulating Streaming Time Series Dataset 19 3.5.2 LRRDS with Streaming Dataset Method 19 CHAPTER 4 21 4.1 Improving the Original LRRDS 21 4.1.1 Kullback-Leibler Divergence 21 4.1.2 Integral for Subsequences Characteristic Measure 22 4.2 Original LRRDS vs. LRRDS-KL 23 4.2.1 Discord Detection Accuracy 23 4.2.2 Computation Time 28 4.3 Capability on Streaming Environment Data 29 4.3.1 LRRDS-KL on Streaming Dataset 29 4.3.2 Computation Time 32 CHAPTER 5 33 REFERENCES 35 APPENDIX 37 APPENDIX A. Confusion Matrix Table (All Parameter) 37 APPENDIX B. Best Parameter and Confusion Matrix for Each Dataset. 39 APPENDIX C. Computation Time of LRRDS vs LRRDS-KL. 40 APPENDIX D. LRRDS-KL Graph on Simulated Streaming Environment Dataset (cont.) 41

[1] J. Lee, B. Bagheri, and H.-A. Kao, "A cyber-physical systems architecture for industry 4.0-based manufacturing systems," Manufacturing Letters, vol. 3, pp. 18-23, 2015.
[2] C.-L. Yang et al., "Streaming data analysis framework for cyber-physical system of metal machining processes," presented at the 2018 IEEE Industrial Cyber-Physical Systems (ICPS), St. Petersburg, Russia, 2018.
[3] D. Pandit, L. Zhang, N. Aslam, C. Liu, and S. Chattopadhyay, "Improved abnormality detection from raw ECG signals using feature enhancement," in 2016 12th International Conference on Natural Computation, Fuzzy Systems and Knowledge Discovery (ICNC-FSKD), 2016, pp. 1402-1406.
[4] H. M. Rai, A. Trivedi, and S. Shukla, "ECG signal processing for abnormalities detection using multi-resolution wavelet transform and Artificial Neural Network classifier," Measurement, vol. 46, no. 9, pp. 3238-3246, 2013/11/01/ 2013.
[5] J. Jun and Y. Peng, "A Study on Improving the Abnormal Signal Detection Ability of Digital Storage Oscilloscope," in 2013 IEEE 11th International Conference on Dependable, Autonomic and Secure Computing, 2013, pp. 244-247.
[6] A. Apostolico, M. E. Bock, and S. Lonardi, "Monotony of surprise and large-scale quest for unusual words," presented at the Proceedings of the sixth annual international conference on Computational biology, Washington, DC, USA, 2002.
[7] G. D. Stormo, "Biological Sequence Analysis: Probabilistic Models of Proteins and Nucleic Acids. Richard Durbin , Sean R. Eddy , Anders Krogh , Graeme Mitchison," The Quarterly Review of Biology, vol. 75, no. 3, pp. 313-314, 2000.
[8] A. Gionis and H. Mannila, "Finding recurrent sources in sequences," presented at the Proceedings of the seventh annual international conference on Research in computational molecular biology, Berlin, Germany, 2003.
[9] T. L. Bailey, "Discovering Sequence Motifs," in Bioinformatics: Data, Sequence Analysis and Evolution, J. M. Keith, Ed. Totowa, NJ: Humana Press, 2008, pp. 231-251.
[10] J. Buhler and M. Tompa, "Finding motifs using random projections," presented at the Proceedings of the fifth annual international conference on Computational biology, Montreal, Quebec, Canada, 2001.
[11] E. Keogh, J. Lin, and A. Fu, "HOT SAX: Efficiently Finding the Most Unusual Time Series Subsequence," presented at the Proceedings of the Fifth IEEE International Conference on Data Mining, 2005.
[12] J. Lin, E. Keogh, S. Lonardi, and B. Chiu, "A symbolic representation of time series, with implications for streaming algorithms," presented at the Proceedings of the 8th ACM SIGMOD workshop on Research issues in data mining and knowledge discovery, San Diego, California, 2003.
[13] C. Yang et al., "Streaming data analysis framework for cyber-physical system of metal machining processes," in 2018 IEEE Industrial Cyber-Physical Systems (ICPS), 2018, pp. 546-551.
[14] M. Hu, X. Feng, Z. Ji, K. Yan, and S. Zhou, "A novel computational approach for discord search with local recurrence rates in multivariate time series," Information Sciences, vol. 477, pp. 220-233, 2019.
[15] Y. Zou, M. Thiel, M. C. Romano, and J. Kurths, "Analytical description of recurrence plots of dynamical systems with nontrivial recurrences," International Journal of Bifurcation and Chaos, vol. 17, pp. 4273-4283, 2007.
[16] J.-P. Eckmann, S. O. Kamphrost, and D. Ruelle, "Recurrence plots of dynamic systems," Europhys. Lett., vol. 4, no. 9, pp. 973-977, 1987.
[17] J. B. Gao, Y. Cao, L. Gu, J. G. Harris, and J. C. Principe, "Detection of weak transitions in signal dynamics using recurrence time statistics," Phys. Lett. A,, vol. 317, pp. 64-72, 2003.
[18] C. L. Webber and J. P. Zbilut, "Dynamical assessment of physiological systems and states using recurrence plot strategies," Journal of Applied Physiology, vol. 76, pp. 965-973, 1994.
[19] J. M. Nichols, S. T. Trickey, and M. Seaver, "Damage detection using multivariate recurrence quantification analysis," Mechanical Systems and Signal Processing, vol. 20, no. 2, pp. 421-437, 2006.
[20] W. Luo, M. Gallagher, and J. Wiles, "Parameter-free search of time-series discord," Journal of Computer Science and Technology, vol. 28, no. 2, pp. 300-310, 2013.
[21] R. G. Huffaker, "Phase space reconstruction from time series data: Where history meets theory," presented at the International European Forum on System Dynamics and Innovation in Food Networks, Innsbruck-Igls, Austria, 2010.
[22] J. S. Iwanski and E. Bradley, "Recurrence plots of experimental data: to embed or not to embed?," Chaos, vol. 8, pp. 861-871, 1998.
[23] J. Theiler, "Spurious dimension from correlation algorithms applied to limited time-series data," Phys. Rev. A, vol. 34, pp. 2427-2432, 1986.
[24] H. Li, "Piecewise aggregate representations and lower-bound distance functions for multivariate time series," Physica A: Statistical Mechanics and its Applications, vol. 427, pp. 10-25, 2015/06/01/ 2015.
[25] R. V. Donner, Y. Zou, J. F. Donges, N. Marwan, and J. Kurths, "Recurrence networks—a novel paradigm for nonlinear time series analysis," New Journal of Physics, vol. 12, no. 3, p. 033025, 2010/03/15 2010.

QR CODE