
Graduate student: Zhen-Xuan Xu (徐振軒)
Thesis title: Integration of MiniRocket and GA-based ensemble method for time series classification - A case study of predictive maintenance (整合MiniRocket與遺傳演算法為基礎之集成方法於時間序列分類-以預測性維護為例)
Advisor: Ren-Jieh Kuo (郭人介)
Committee members: Shih-Che Lo (羅士哲), Chia-Yu Hsu (許嘉裕)
Degree: Master
Department: Department of Industrial Management, College of Management
Year of publication: 2022
Graduation academic year: 110
Language: English
Number of pages: 77
Keywords (Chinese): 時間序列、震動感測器、遺傳演算法、集成學習
Keywords (English): Time series, Vibration sensor, Genetic algorithm, Ensemble learning
  • The importance of predictive maintenance has received wide attention in recent years, because an unexpected machine breakdown causes many losses, and it usually takes considerable time to manufacture and ship replacement parts. Since it is impractical and uneconomical for a company to stock every part it might possibly need, predictive maintenance can be of great help in this situation.
    However, when handling time series data collected from machines, traditional statistical methods usually incur enormous computational cost and consume a great deal of time, yet deliver only imprecise results. Therefore, this study proposes a method to improve performance: it applies a genetic algorithm to ensemble learning, taking the features extracted from the time series data by MiniRocket as input, in the hope of improving the analysis results.
    This study uses a real-world dataset as a case study, provided by a well-known wire and cable company in Taiwan. The performance of the proposed algorithm is measured by four indicators, and the experimental results show that the GA-based ensemble algorithm yields better results than any of the individual algorithms used alone.


    The importance of predictive maintenance has been widely recognized recently, because an unexpected machine breakdown causes many losses, and it usually takes a lot of time to manufacture and ship the components needed for repair. Since it is unrealistic and uneconomical for a company to store every component it may need, predictive maintenance can be helpful.
    However, when dealing with time series data detected from machines, traditional statistical methods usually incur huge computational expense and take enormous time, but can only provide imprecise results. Therefore, this study proposes a method to improve time series classification performance: it applies a genetic algorithm (GA) to ensemble learning, while taking the features extracted from the time series data by MiniRocket as input.
    Besides the benchmark datasets, a real-world application is also used in the current study, offered by a well-known company providing wires and cables in Taiwan. The proposed algorithm's performance is measured by four indicators. According to the experimental results, employing the GA-based ensemble learning method yields better outcomes than utilizing the individual algorithms alone, for both the benchmark and case study datasets.
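The core idea described above (a GA searching for weights that combine base classifiers trained on MiniRocket features) can be sketched as follows. This is a minimal illustrative sketch, not the thesis's actual method: the per-classifier probabilities, labels, fitness function, and GA settings below are all assumed stand-ins.

```python
import random

random.seed(0)

# Hypothetical class-1 probabilities from three base classifiers on six
# samples, standing in for models trained on MiniRocket features.
probs = [
    [0.9, 0.6, 0.8, 0.4, 0.7, 0.1],  # classifier A
    [0.4, 0.3, 0.9, 0.6, 0.8, 0.2],  # classifier B
    [0.8, 0.4, 0.3, 0.3, 0.6, 0.7],  # classifier C
]
labels = [1, 0, 1, 0, 1, 0]

def accuracy(weights):
    """Fitness: accuracy of the weighted-average ensemble prediction."""
    total = sum(weights) or 1e-9
    correct = 0
    for i, y in enumerate(labels):
        p = sum(w * probs[k][i] for k, w in enumerate(weights)) / total
        correct += int((p >= 0.5) == bool(y))
    return correct / len(labels)

def evolve(pop_size=20, generations=30, mut_rate=0.2):
    """GA over weight vectors: truncation selection, one-point
    crossover, and random-reset mutation."""
    pop = [[random.random() for _ in probs] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=accuracy, reverse=True)
        parents = pop[: pop_size // 2]
        children = []
        while len(children) < pop_size - len(parents):
            a, b = random.sample(parents, 2)
            cut = random.randrange(1, len(probs))   # one-point crossover
            child = a[:cut] + b[cut:]
            if random.random() < mut_rate:          # mutation
                child[random.randrange(len(child))] = random.random()
            children.append(child)
        pop = parents + children
    return max(pop, key=accuracy)

best = evolve()
print(f"best ensemble accuracy: {accuracy(best):.2f}")
```

In this toy setup no single classifier is perfect (A alone reaches 5/6), but a suitably weighted average separates all six samples, which is the kind of improvement the GA-based ensemble is meant to find.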

    Abstract (Chinese) I
    ABSTRACT II
    Acknowledgements III
    CONTENTS IV
    List of Tables VII
    List of Figures VIII
    CHAPTER 1 Introduction 1
    1.1 Research Background and Motivation 1
    1.2 Research Objectives 3
    1.3 Research Scope and Constraints 3
    1.4 Thesis Organization 3
    CHAPTER 2 Literature Review 5
    2.1 Time Series Feature Extraction Methods 5
    2.1.1 Dictionary-based approach 5
    2.1.2 Distance-based approach 6
    2.1.3 Interval-based approach 6
    2.1.4 Frequency-based approach 7
    2.1.5 Shapelet-based approach 8
    2.2 MiniRocket Algorithm 9
    2.3 Classifiers 11
    2.3.1 ANN 11
    2.3.2 SVM 12
    2.3.3 Random Forest 12
    2.3.4 XGBoost 13
    2.3.5 LightGBM 14
    2.4 Genetic Algorithm 15
    2.4.1 Initial population 16
    2.4.2 Fitness calculation 16
    2.4.3 Parents Selection 16
    2.4.4 Crossover 17
    2.4.5 Mutation 17
    2.5 Ensemble Learning Methods 17
    2.5.1 Bagging 18
    2.5.2 Boosting 19
    2.5.3 Stacking 19
    2.5.4 GA-Based Ensemble Learning 20
    CHAPTER 3 Methodology 21
    3.1 Research Framework 21
    3.2 Data Preprocessing 22
    3.3 MiniRocket Transform 22
    3.3.1 Hyperparameters For Moving Randomness in ROCKET Away from MiniRocket 23
    3.3.2 Pooling of convolutional output 24
    3.3.3 Algorithm 24
    3.4 GA-Based Ensemble Learning 27
    CHAPTER 4 Experimental Results 31
    4.1 Datasets 31
    4.2 Performance Measurement 31
    4.3 Parameters Setting 34
    4.4 Experiment Results 35
    4.5 Statistical Hypothesis 42
    4.6 Computational Time 47
    CHAPTER 5 Case Study 48
    5.1 Predictive Maintenance 48
    5.2 Dataset 48
    5.3 Experiment Results 49
    5.4 Statistical Hypothesis 53
    5.5 Computational Time 55
    CHAPTER 6 Conclusions and Future Research 56
    6.1 Conclusions 56
    6.2 Contributions 56
    6.3 Future Research 56
    REFERENCES 58
    APPENDIX 64

    Abanda, A., Mori, U., & Lozano, J. A. (2019). A review on distance based time series classification. Data Mining and Knowledge Discovery, 33(2), 378-412.
    Amani, F., Abdollahi, J., Mohammadnia, A., Amani, P., & Fattahzadeh-Ardalani, G. (2022). Using stacking methods based genetic algorithm to predict the time between symptom onset and hospital arrival in stroke patients and its related factors. Journal of Biostatistics and Epidemiology, 8(1), 8-23.
    Angelova, M., & Pencheva, T. (2011). Tuning genetic algorithm parameters to improve convergence time. International Journal of Chemical Engineering, 2011.
    Bentéjac, C., Csörgő, A., & Martínez-Muñoz, G. (2021). A comparative analysis of gradient boosting algorithms. Artificial Intelligence Review, 54(3), 1937-1967.
    Biau, G., & Scornet, E. (2016). A random forest guided tour. Test, 25(2), 197-227.
    Breiman, L. (1996). Bagging predictors. Machine Learning, 24(2), 123-140.
    Brockwell, P. J. (2001). Continuous-time ARMA processes. Handbook of Statistics, 19, 249-276.
    Cao, Z., Li, Z., Zhang, J., & Fu, H. (2022). A homogeneous stacking ensemble learning model for fault diagnosis of rotating machinery with small samples. IEEE Sensors Journal, 22(9), 8944-8959.
    Carvalho, T. P., Soares, F. A., Vita, R., Francisco, R. D. P., Basto, J. P., & Alcalá, S. G. (2019). A systematic literature review of machine learning methods applied to predictive maintenance. Computers & Industrial Engineering, 137, 106024.
    Castro, N., & Azevedo, P. (2010, April). Multiresolution motif discovery in time series. Proceedings of the 2010 SIAM International Conference on Data Mining, 665-676. Society for Industrial and Applied Mathematics.
    Chen, T., & Guestrin, C. (2016, August). Xgboost: A scalable tree boosting system. Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 785-794.
    Cover, T., & Hart, P. (1967). Nearest neighbor pattern classification. IEEE Transactions on Information Theory, 13(1), 21-27.
    Wolpert, D. H. (1992). Stacked generalization. Neural Networks, 5(2), 241-259.
    Deb, K., & Agrawal, S. (1999). Understanding interactions among genetic algorithm parameters. Foundations of Genetic Algorithms, 5(5), 265-286.
    Dempster, A., Petitjean, F., & Webb, G. I. (2020). ROCKET: exceptionally fast and accurate time series classification using random convolutional kernels. Data Mining and Knowledge Discovery, 34(5), 1454-1495.
    Dempster, A., Schmidt, D. F., & Webb, G. I. (2021, August). MiniRocket: A very fast (almost) deterministic transform for time series classification. Proceedings of the 27th ACM SIGKDD Conference on Knowledge Discovery & Data Mining, 248-257.
    Deng, H., Runger, G., Tuv, E., & Vladimir, M. (2013). A time series forest for classification and feature extraction. Information Sciences, 239, 142-153.
    Dhariyal, B., Le Nguyen, T., Gsponer, S., & Ifrim, G. (2020, November). An examination of the state-of-the-art for multivariate time series classification. Proceedings of the 2020 International Conference on Data Mining Workshops (ICDMW), 243-250. IEEE.
    Ding, H., Trajcevski, G., Scheuermann, P., Wang, X., & Keogh, E. (2008). Querying and mining of time series data: experimental comparison of representations and distance measures. Proceedings of the VLDB Endowment, 1(2), 1542-1552.
    Dong, X., Yu, Z., Cao, W., Shi, Y., & Ma, Q. (2020). A survey on ensemble learning. Frontiers of Computer Science, 14(2), 241-258.
    Džeroski, S., & Ženko, B. (2004). Is combining classifiers with stacking better than selecting the best one? Machine Learning, 54(3), 255-273.
    Faouzi, J. (2022). Time series classification: a review of algorithms and implementations. Machine Learning (Emerging Trends and Applications).
    Fu, T. C. (2011). A review on time series data mining. Engineering Applications of Artificial Intelligence, 24(1), 164-181.
    Gudmundsson, S., Runarsson, T. P., & Sigurdsson, S. (2008, June). Support vector machines and dynamic time warping for time series. Proceedings of the 2008 IEEE International Joint Conference on Neural Networks (IEEE World Congress on Computational Intelligence), 2772-2776. IEEE.
    Hall, A. R. (2004). Generalized method of moments. OUP Oxford.
    Han, S., Qubo, C., & Meng, H. (2012, June). Parameter selection in SVM with RBF kernel function. World Automation Congress 2012, 1-4.
    Hashemian, H. M. (2010). State-of-the-art predictive maintenance techniques. IEEE Transactions on Instrumentation and Measurement, 60(1), 226-236.
    Holland, J. H. (1975). Genetic algorithms. Scientific American, 267(1), 66-73.
    Ismail Fawaz, H., Forestier, G., Weber, J., Idoumghar, L., & Muller, P. A. (2019). Deep learning for time series classification: a review. Data Mining and Knowledge Discovery, 33(4), 917-963.
    Jain, A. K., Mao, J., & Mohiuddin, K. M. (1996). Artificial neural networks: a tutorial. Computer, 29(3), 31-44.
    Kasten, E. P., McKinley, P. K., & Gage, S. H. (2007, June). Automated ensemble extraction and analysis of acoustic data streams. Proceedings of the 27th International Conference on Distributed Computing Systems Workshops (ICDCSW'07), 66-66. IEEE.
    Ke, G., Meng, Q., Finley, T., Wang, T., Chen, W., Ma, W., Ye, Q., & Liu, T. Y. (2017). LightGBM: A highly efficient gradient boosting decision tree. Advances in Neural Information Processing Systems, 30.
    Keogh, E., Chakrabarti, K., Pazzani, M., & Mehrotra, S. (2001). Dimensionality reduction for fast similarity search in large time series databases. Knowledge and Information Systems, 3(3), 263-286.
    Khoei, T. T., Labuhn, M. C., Caleb, T. D., Hu, W. C., & Kaabouch, N. (2021, May). A stacking-based ensemble learning model with genetic algorithm for detecting early stages of Alzheimer’s disease. Proceedings of the 2021 IEEE International Conference on Electro Information Technology (EIT), 215-222. IEEE.
    Koza, J. R. (1994). Genetic programming as a means for programming computers by natural selection. Statistics and Computing, 4(2), 87-112.
    LeCun, Y., Jackel, L., Bottou, L., Brunot, A., Cortes, C., Denker, J., Drucker, H., Guyon, I., Muller, U., Sackinger, E., Simard, P., & Vapnik, V. (1995). Comparison of learning algorithms for handwritten digit recognition. Proceedings of the International Conference on Artificial Neural Networks, 60(1), 53-60.
    Liang, Y., Zhang, M., & Browne, W. N. (2017). Genetic programming for evolving figure-ground segmentors from multiple features. Applied Soft Computing, 51, 83-95.
    Liaw, A., & Wiener, M. (2002). Classification and regression by randomForest. R News, 2(3), 18-22.
    Lin, J., Keogh, E., Wei, L., & Lonardi, S. (2007). Experiencing SAX: a novel symbolic representation of time series. Data Mining and Knowledge Discovery, 15(2), 107-144.
    Lines, J., Taylor, S., & Bagnall, A. (2016, December). Hive-cote: The hierarchical vote collective of transformation-based ensembles for time series classification. Proceedings of the 2016 IEEE 16th International Conference on Data Mining (ICDM), 1041-1046. IEEE.
    Middlehurst, M., Vickers, W., & Bagnall, A. (2019, November). Scalable dictionary classifiers for time series classification. Proceedings of the International Conference on Intelligent Data Engineering and Automated Learning, 11-19. Springer, Cham.
    Müller, M. (2007). Dynamic time warping. Information Retrieval for Music and Motion, 69-84. Springer, Berlin, Heidelberg.
    Niennattrakul, V., Ruengronghirunya, P., & Ratanamahatana, C. A. (2010). Exact indexing for massive time series databases under time warping distance. Data Mining and Knowledge Discovery, 21(3), 509-541.
    Noble, W. S. (2006). What is a support vector machine?. Nature Biotechnology, 24(12), 1565-1567.
    Ruiz, A. P., Flynn, M., Large, J., Middlehurst, M., & Bagnall, A. (2021). The great multivariate time series classification bake off: a review and experimental evaluation of recent algorithmic advances. Data Mining and Knowledge Discovery, 35(2), 401-449.
    Sagi, O., & Rokach, L. (2018). Ensemble learning: A survey. Wiley Interdisciplinary Reviews: Data Mining and Knowledge Discovery, 8(4), e1249.
    Sánchez-Sánchez, P. A., García-González, J. R., & Coronell, L. H. P. (2019). Encountered problems of time series with neural networks: Models and architectures. Recent Trends in Artificial Neural Networks-from Training to Prediction.
    Schäfer, P. (2015). The BOSS is concerned with time series classification in the presence of noise. Data Mining and Knowledge Discovery, 29(6), 1505-1530.
    Schäfer, P. (2016). Scalable time series classification. Data Mining and Knowledge Discovery, 30(5), 1273-1298.
    Schäfer, P., & Högqvist, M. (2012, March). SFA: a symbolic fourier approximation and index for similarity search in high dimensional datasets. Proceedings of the 15th International Conference on Extending Database Technology, 516-527.
    Schapire, R. E. (2003). The boosting approach to machine learning: An overview. Nonlinear Estimation and Classification, 149-171.
    Selcuk, S. (2017). Predictive maintenance, its implementation and latest trends. Proceedings of the Institution of Mechanical Engineers, Part B: Journal of Engineering Manufacture, 231(9), 1670-1679.
    Sikora, R. (2015). A modified stacking ensemble machine learning algorithm using genetic algorithms. Handbook of Research on Organizational Transformations through Big Data Analytics, 43-53. IGi Global.
    Susmaga, R. (2004). Confusion matrix visualization. Intelligent Information Processing and Web Mining, 107-116. Springer, Berlin, Heidelberg.
    Tang, Y. (2013). Deep learning using linear support vector machines. arXiv preprint arXiv:1306.0239.
    Torlay, L., Perrone-Bertolotti, M., Thomas, E., & Baciu, M. (2017). Machine learning–XGBoost analysis of language networks to classify patients with epilepsy. Brain Informatics, 4(3), 159-169.
    Tsakiridis, N. L., Tziolas, N. V., Theocharis, J. B., & Zalidis, G. C. (2019). A genetic algorithm‐based stacking algorithm for predicting soil organic matter from vis–NIR spectral data. European Journal of Soil Science, 70(3), 578-590.
    Wang, Z., Yan, W., & Oates, T. (2017, May). Time series classification from scratch with deep neural networks: A strong baseline. Proceedings of the 2017 International Joint Conference on Neural Networks (IJCNN), 1578-1585. IEEE.
    Whitley, D. (1994). A genetic algorithm tutorial. Statistics and Computing, 4(2), 65-85.
    Wistreich, J. G. (1958). The fundamentals of wire drawing. Metallurgical Reviews, 3(1), 97-142.
    Ye, L., & Keogh, E. (2009). Time series shapelets: a new primitive for data mining. Proceedings of the 15th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 947-956.
    Zang, W., Zhang, P., Zhou, C., & Guo, L. (2014). Comparative study between incremental and ensemble learning on data streams: Case study. Journal of Big Data, 1(1), 1-16.
    Zhang, Y. (2012, September). Support vector machine classification algorithm and its application. Proceedings of the International Conference on Information Computing and Applications, 179-186. Springer, Berlin, Heidelberg.
    Zhang, Y., & Haghani, A. (2015). A gradient boosting method to improve travel time prediction. Transportation Research Part C: Emerging Technologies, 58, 308-324.
    Zhou, Z. H. (2021). Ensemble learning. Machine Learning, 181-210. Springer, Singapore.
    Zonta, T., Da Costa, C. A., da Rosa Righi, R., de Lima, M. J., da Trindade, E. S., & Li, G. P. (2020). Predictive maintenance in the Industry 4.0: A systematic literature review. Computers & Industrial Engineering, 150, 106889.

    Full text is not available for download.
    Full-text release date: 2025/09/19 (campus network)
    Full-text release date: 2032/09/19 (off-campus network)
    Full-text release date: 2042/09/19 (National Central Library: Taiwan NDLTD system)