
Author: Mohammad Iqbal
Title: Distinguishing Active Learning and Classification on Sequential Data
Advisor: Hsing-Kuo Pao
Committee: Yuh-Jye Lee, Tien-Ruey Hsiang, Wei-Chung Teng, Chuan-Kai Yang, Min-Te Sun
Degree: Doctoral
Department: Department of Computer Science and Information Engineering, College of Electrical Engineering and Computer Science
Year of Publication: 2022
Graduation Academic Year: 110
Language: English
Pages: 84
Keywords: Distinguishing Sub-sequence, Minimum Length Sequence, Sequential Data, Sequence Classification


Ubiquitous sensor devices integrated with Internet-of-Things technology give us abundant sequential data representing real-world problems, such as trip destination forecasting, predictive maintenance, and other sequence classification tasks. This study focuses on sequential pattern mining to provide a classifier that is easy to understand in the form of rules, which can be applied in recommendation or decision support systems. Instead of focusing on frequent patterns, we adopt contrast patterns, which better fit the nature of classification tasks. This study defines a contrast pattern as a distinguishing sub-sequence: a sub-sequence that occurs often in one class of sequences yet rarely in the other classes. Furthermore, we propose an efficient algorithm to find non-redundant rules that separate a specific class from the others. With such rules, we aim to predict the class of unseen sequential data early, from only a partial sequence.
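The distinguishing sub-sequence criterion described above can be illustrated with a minimal brute-force sketch. This is not the thesis's mining algorithm; the thresholds `alpha` and `beta` and the naive enumeration over positive sequences are illustrative assumptions:

```python
from itertools import combinations

def is_subsequence(pat, seq):
    """True if pat occurs in seq as a subsequence (unrestricted gaps)."""
    it = iter(seq)
    return all(item in it for item in pat)

def support(pat, sequences):
    """Fraction of sequences that contain pat."""
    return sum(is_subsequence(pat, s) for s in sequences) / len(sequences)

def distinguishing_subsequences(pos, neg, alpha=0.6, beta=0.2, max_len=3):
    """Brute-force sketch: enumerate candidate patterns drawn from the
    positive sequences and keep those that are frequent in the positive
    class (support >= alpha) yet rare in the negative class (<= beta)."""
    found = set()
    for seq in pos:
        for k in range(1, max_len + 1):
            for idx in combinations(range(len(seq)), k):
                pat = tuple(seq[i] for i in idx)
                if pat in found:
                    continue
                if support(pat, pos) >= alpha and support(pat, neg) <= beta:
                    found.add(pat)
    return found
```

For example, with `pos = ["abc", "abd", "acd"]` and `neg = ["bda", "cba"]`, the single item `('a',)` is rejected because it also covers every negative sequence, while pairs such as `('a', 'b')` survive: frequent in the positive class, absent from the negative one.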

In addition, we may need to annotate a large amount of unlabeled data to solve the sequence classification task at hand. However, annotation may be infeasible because of the heavy examination effort and the perplexity arising from the complex nature of long sequences. Therefore, this study also aims to reduce annotation cost and ambiguity. To do so, we perform two selections to draw the most representative distinguishing sub-sequence from a large pool set. The first, distinguishing selection, collects from the pool any distinguishing sub-sequences that are most likely to belong to a certain class and less likely to belong to another. The second, refinement selection, picks the one distinguishing sub-sequence with the smallest discrepancy among the candidates. Moreover, we propose an active learning algorithm that iteratively takes the most representative distinguishing sub-sequence from the pool set to be annotated until the model's accuracy converges.
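One step of the two-stage selection above can be sketched as a pool-based query function. Everything here is an assumption for illustration: `predict_proba` and `featurize` are hypothetical hooks supplied by the caller, and "smallest discrepancy" is interpreted as the smallest mean feature distance to the other shortlisted candidates, i.e. the most representative one:

```python
import math

def margin(probs):
    """Gap between the top-two class probabilities: large when the
    evidence is one-sided toward a single class."""
    top = sorted(probs, reverse=True)
    return top[0] - top[1]

def active_select(pool, predict_proba, featurize, k=10):
    """One query step of a two-stage active-learning sketch.

    Stage 1 (distinguishing selection): keep the k pool items whose
    predicted class distribution is most one-sided.
    Stage 2 (refinement selection): among those, return the item with
    the smallest mean distance (discrepancy) to the other candidates.
    """
    candidates = sorted(pool, key=lambda x: margin(predict_proba(x)),
                        reverse=True)[:k]

    def mean_dist(x):
        fx = featurize(x)
        others = [y for y in candidates if y is not x]
        return sum(math.dist(fx, featurize(y)) for y in others) / max(len(others), 1)

    return min(candidates, key=mean_dist)
```

In a full loop, the selected item would be sent to an annotator, added to the labeled set, and the classifier retrained, repeating until accuracy converges, as the abstract describes.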

Table of Contents

Abstract
Acknowledgment
Table of Contents
List of Figures
List of Tables
Abbreviations
1 Introduction
  1.1 Background
  1.2 Aims and Contributions
  1.3 Dissertation Outline
2 Literature Review
  2.1 Sequence Classification
  2.2 Conditional Random Fields
  2.3 Mining Minimal Distinguishing Subsequence Patterns
  2.4 Active Learning
  2.5 Active Learning on Sequence Classification
3 Mining Minimal Distinguishing Subsequence Patterns on Multi-class Classification
4 Mining Non-redundant Distinguishing Subsequence Rules
  4.1 NConSeqMiner Algorithm
  4.2 Candidate Sub-sequence Generation
  4.3 Support Calculation
  4.4 Redundant Rule Removal
5 Distinguishing Active Learning for Sequential Data
  5.1 Preliminaries
  5.2 The Distinguishing Selection
  5.3 The Refinement Selection
  5.4 Active Distinguishing Sub-sequence
  5.5 Horizontal Batch Selection
6 Experiments
  6.1 Dataset Information
  6.2 Sequence Classification Results
    6.2.1 Experimental Settings
    6.2.2 Evaluation Metrics
    6.2.3 Small-scale Dataset
    6.2.4 Large-scale Dataset
  6.3 Active Learning Results
    6.3.1 Experimental Settings
    6.3.2 Text Classification
    6.3.3 Trip Destination Prediction
7 Conclusion and Future Works
  7.1 Conclusions
  7.2 Future Works
References
Related Publications
Biography

