
Graduate student: Zhen-Xuan Xu (徐振軒)
Thesis title: Integration of MiniRocket and GA-based ensemble method for time series classification - A case study of predictive maintenance (整合MiniRocket與遺傳演算法為基礎之集成方法於時間序列分類-以預測性維護為例)
Advisor: Ren-Jieh Kuo (郭人介)
Committee members: Shih-Che Lo (羅士哲), Chia-Yu Hsu (許嘉裕)
Degree: Master
Department: Department of Industrial Management, College of Management
Year of publication: 2022
Graduation academic year: 110
Language: English
Number of pages: 77
Keywords (Chinese): 時間序列、震動感測器、遺傳演算法、集成學習
Keywords (English): Time series, Vibration sensor, Genetic algorithm, Ensemble learning
  • The importance of predictive maintenance has received wide attention in recent years, because an unexpected machine breakdown causes many losses, and it usually takes considerable time to manufacture and ship replacement parts. Since it is impractical and uneconomical for a company to stock every part it might possibly need, predictive maintenance can be of great help in this situation.
    However, when handling time series data collected from machines, traditional statistical methods usually incur enormous computational cost and consume a great deal of time, yet deliver only imprecise results. Therefore, this study proposes a method to improve performance: it applies a genetic algorithm to ensemble learning, taking the features extracted from the time series data by MiniRocket as input, in the hope of improving the analysis results.
    This study uses a real-world dataset as a case study, provided by a well-known wire and cable company in Taiwan. The performance of the proposed algorithm is measured by four indicators, and the experimental results show that the GA-based ensemble algorithm yields better results than any of the individual algorithms used alone.


    The importance of predictive maintenance has been widely recognized recently, because an unexpected machine breakdown causes many losses, and it usually takes a lot of time to manufacture and ship the components needed for repair. Since it is unrealistic and uneconomical for a company to store every component it may need, predictive maintenance can be helpful.
    However, when dealing with time series data detected from machines, traditional statistical methods usually incur huge computational expense and take enormous time, but can only provide imprecise results. Therefore, this study proposes a method to improve time series classification performance: it applies a genetic algorithm (GA) to ensemble learning, while taking the features extracted from the time series data by MiniRocket as input.
    Besides the benchmark datasets, a real-world application is also used in the current study, offered by a well-known company providing wires and cables in Taiwan. The proposed algorithm's performance is measured by four indicators. According to the experimental results, employing the GA-based ensemble learning method yields better outcomes than utilizing the individual algorithms alone, for both the benchmark and case study datasets.
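The core idea described above (a GA searching for weights that combine base classifiers trained on MiniRocket features) can be sketched as follows. This is a minimal illustrative sketch, not the thesis's actual method: the per-classifier probabilities, labels, fitness function, and GA settings below are all assumed stand-ins.

```python
import random

random.seed(0)

# Hypothetical class-1 probabilities from three base classifiers on six
# samples, standing in for models trained on MiniRocket features.
probs = [
    [0.9, 0.6, 0.8, 0.4, 0.7, 0.1],  # classifier A
    [0.4, 0.3, 0.9, 0.6, 0.8, 0.2],  # classifier B
    [0.8, 0.4, 0.3, 0.3, 0.6, 0.7],  # classifier C
]
labels = [1, 0, 1, 0, 1, 0]

def accuracy(weights):
    """Fitness: accuracy of the weighted-average ensemble prediction."""
    total = sum(weights) or 1e-9
    correct = 0
    for i, y in enumerate(labels):
        p = sum(w * probs[k][i] for k, w in enumerate(weights)) / total
        correct += int((p >= 0.5) == bool(y))
    return correct / len(labels)

def evolve(pop_size=20, generations=30, mut_rate=0.2):
    """GA over weight vectors: truncation selection, one-point
    crossover, and random-reset mutation."""
    pop = [[random.random() for _ in probs] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=accuracy, reverse=True)
        parents = pop[: pop_size // 2]
        children = []
        while len(children) < pop_size - len(parents):
            a, b = random.sample(parents, 2)
            cut = random.randrange(1, len(probs))   # one-point crossover
            child = a[:cut] + b[cut:]
            if random.random() < mut_rate:          # mutation
                child[random.randrange(len(child))] = random.random()
            children.append(child)
        pop = parents + children
    return max(pop, key=accuracy)

best = evolve()
print(f"best ensemble accuracy: {accuracy(best):.2f}")
```

In this toy setup no single classifier is perfect (A alone reaches 5/6), but a suitably weighted average separates all six samples, which is the kind of improvement the GA-based ensemble is meant to find.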

    Abstract (Chinese) I
    ABSTRACT II
    Acknowledgements III
    CONTENTS IV
    List of Tables VII
    List of Figures VIII
    CHAPTER 1 Introduction 1
    1.1 Research Background and Motivation 1
    1.2 Research Objectives 3
    1.3 Research Scope and Constraints 3
    1.4 Thesis Organization 3
    CHAPTER 2 Literature Review 5
    2.1 Time Series Feature Extraction Methods 5
    2.1.1 Dictionary-based approach 5
    2.1.2 Distance-based approach 6
    2.1.3 Interval-based approach 6
    2.1.4 Frequency-based approach 7
    2.1.5 Shapelet-based approach 8
    2.2 MiniRocket Algorithm 9
    2.3 Classifiers 11
    2.3.1 ANN 11
    2.3.2 SVM 12
    2.3.3 Random Forest 12
    2.3.4 XGBoost 13
    2.3.5 LightGBM 14
    2.4 Genetic Algorithm 15
    2.4.1 Initial population 16
    2.4.2 Fitness calculation 16
    2.4.3 Parents Selection 16
    2.4.4 Crossover 17
    2.4.5 Mutation 17
    2.5 Ensemble Learning Methods 17
    2.5.1 Bagging 18
    2.5.2 Boosting 19
    2.5.3 Stacking 19
    2.5.4 GA-Based Ensemble Learning 20
    CHAPTER 3 Methodology 21
    3.1 Research Framework 21
    3.2 Data Preprocessing 22
    3.3 MiniRocket Transform 22
    3.3.1 Hyperparameters For Moving Randomness in ROCKET Away from MiniRocket 23
    3.3.2 Pooling of convolutional output 24
    3.3.3 Algorithm 24
    3.4 GA-Based Ensemble Learning 27
    CHAPTER 4 Experimental Results 31
    4.1 Datasets 31
    4.2 Performance Measurement 31
    4.3 Parameters Setting 34
    4.4 Experiment Results 35
    4.5 Statistical Hypothesis 42
    4.6 Computational Time 47
    CHAPTER 5 Case Study 48
    5.1 Predictive Maintenance 48
    5.2 Dataset 48
    5.3 Experiment Results 49
    5.4 Statistical Hypothesis 53
    5.5 Computational Time 55
    CHAPTER 6 Conclusions and Future Research 56
    6.1 Conclusions 56
    6.2 Contributions 56
    6.3 Future Research 56
    REFERENCES 58
    APPENDIX 64

    Abanda, A., Mori, U., & Lozano, J. A. (2019). A review on distance based time series classification. Data Mining and Knowledge Discovery, 33(2), 378-412.
    Amani, F., Abdollahi, J., Mohammadnia, A., Amani, P., & Fattahzadeh-Ardalani, G. (2022). Using stacking methods based genetic algorithm to predict the time between symptom onset and hospital arrival in stroke patients and its related factors. Journal of Biostatistics and Epidemiology, 8(1), 8-23.
    Angelova, M., & Pencheva, T. (2011). Tuning genetic algorithm parameters to improve convergence time. International Journal of Chemical Engineering, 2011.
    Bentéjac, C., Csörgő, A., & Martínez-Muñoz, G. (2021). A comparative analysis of gradient boosting algorithms. Artificial Intelligence Review, 54(3), 1937-1967.
    Biau, G., & Scornet, E. (2016). A random forest guided tour. Test, 25(2), 197-227.
    Breiman, L. (1996). Bagging predictors. Machine Learning, 24(2), 123-140.
    Brockwell, P. J. (2001). Continuous-time ARMA processes. Handbook of Statistics, 19, 249-276.
    Cao, Z., Li, Z., Zhang, J., & Fu, H. (2022). A homogeneous stacking ensemble learning model for fault diagnosis of rotating machinery with small samples. IEEE Sensors Journal, 22(9), 8944-8959.
    Carvalho, T. P., Soares, F. A., Vita, R., Francisco, R. D. P., Basto, J. P., & Alcalá, S. G. (2019). A systematic literature review of machine learning methods applied to predictive maintenance. Computers & Industrial Engineering, 137, 106024.
    Castro, N., & Azevedo, P. (2010, April). Multiresolution motif discovery in time series. Proceedings of the 2010 SIAM International Conference on Data Mining, 665-676. Society for Industrial and Applied Mathematics.
    Chen, T., & Guestrin, C. (2016, August). Xgboost: A scalable tree boosting system. Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 785-794.
    Cover, T., & Hart, P. (1967). Nearest neighbor pattern classification. IEEE Transactions on Information Theory, 13(1), 21-27.
    Wolpert, D. H. (1992). Stacked generalization. Neural Networks, 5(2), 241-259.
    Deb, K., & Agrawal, S. (1999). Understanding interactions among genetic algorithm parameters. Foundations of Genetic Algorithms, 5(5), 265-286.
    Dempster, A., Petitjean, F., & Webb, G. I. (2020). ROCKET: exceptionally fast and accurate time series classification using random convolutional kernels. Data Mining and Knowledge Discovery, 34(5), 1454-1495.
    Dempster, A., Schmidt, D. F., & Webb, G. I. (2021, August). MiniRocket: A very fast (almost) deterministic transform for time series classification. Proceedings of the 27th ACM SIGKDD Conference on Knowledge Discovery & Data Mining, 248-257.
    Deng, H., Runger, G., Tuv, E., & Vladimir, M. (2013). A time series forest for classification and feature extraction. Information Sciences, 239, 142-153.
    Dhariyal, B., Le Nguyen, T., Gsponer, S., & Ifrim, G. (2020, November). An examination of the state-of-the-art for multivariate time series classification. Proceedings of the 2020 International Conference on Data Mining Workshops (ICDMW), 243-250. IEEE.
    Ding, H., Trajcevski, G., Scheuermann, P., Wang, X., & Keogh, E. (2008). Querying and mining of time series data: experimental comparison of representations and distance measures. Proceedings of the VLDB Endowment, 1(2), 1542-1552.
    Dong, X., Yu, Z., Cao, W., Shi, Y., & Ma, Q. (2020). A survey on ensemble learning. Frontiers of Computer Science, 14(2), 241-258.
    Džeroski, S., & Ženko, B. (2004). Is combining classifiers with stacking better than selecting the best one? Machine Learning, 54(3), 255-273.
    Faouzi, J. (2022). Time series classification: a review of algorithms and implementations. Machine Learning (Emerging Trends and Applications).
    Fu, T. C. (2011). A review on time series data mining. Engineering Applications of Artificial Intelligence, 24(1), 164-181.
    Gudmundsson, S., Runarsson, T. P., & Sigurdsson, S. (2008, June). Support vector machines and dynamic time warping for time series. Proceedings of the 2008 IEEE International Joint Conference on Neural Networks (IEEE World Congress on Computational Intelligence), 2772-2776. IEEE.
    Hall, A. R. (2004). Generalized method of moments. OUP Oxford.
    Han, S., Qubo, C., & Meng, H. (2012, June). Parameter selection in SVM with RBF kernel function. World Automation Congress 2012, 1-4.
    Hashemian, H. M. (2010). State-of-the-art predictive maintenance techniques. IEEE Transactions on Instrumentation and Measurement, 60(1), 226-236.
    Holland, J. H. (1975). Genetic algorithms. Scientific American, 267(1), 66-73.
    Ismail Fawaz, H., Forestier, G., Weber, J., Idoumghar, L., & Muller, P. A. (2019). Deep learning for time series classification: a review. Data Mining and Knowledge Discovery, 33(4), 917-963.
    Jain, A. K., Mao, J., & Mohiuddin, K. M. (1996). Artificial neural networks: a tutorial. Computer, 29(3), 31-44.
    Kasten, E. P., McKinley, P. K., & Gage, S. H. (2007, June). Automated ensemble extraction and analysis of acoustic data streams. Proceedings of the 27th International Conference on Distributed Computing Systems Workshops (ICDCSW'07), 66-66. IEEE.
    Ke, G., Meng, Q., Finley, T., Wang, T., Chen, W., Ma, W., Ye, Q., & Liu, T. Y. (2017). LightGBM: A highly efficient gradient boosting decision tree. Advances in Neural Information Processing Systems, 30.
    Keogh, E., Chakrabarti, K., Pazzani, M., & Mehrotra, S. (2001). Dimensionality reduction for fast similarity search in large time series databases. Knowledge and Information Systems, 3(3), 263-286.
    Khoei, T. T., Labuhn, M. C., Caleb, T. D., Hu, W. C., & Kaabouch, N. (2021, May). A stacking-based ensemble learning model with genetic algorithm for detecting early stages of Alzheimer’s disease. Proceedings of the 2021 IEEE International Conference on Electro Information Technology (EIT), 215-222. IEEE.
    Koza, J. R. (1994). Genetic programming as a means for programming computers by natural selection. Statistics and Computing, 4(2), 87-112.
    LeCun, Y., Jackel, L., Bottou, L., Brunot, A., Cortes, C., Denker, J., Drucker, H., Guyon, I., Muller, U., Sackinger, E., Simard, P., & Vapnik, V. (1995). Comparison of learning algorithms for handwritten digit recognition. Proceedings of the International Conference on Artificial Neural Networks, 60(1), 53-60.
    Liang, Y., Zhang, M., & Browne, W. N. (2017). Genetic programming for evolving figure-ground segmentors from multiple features. Applied Soft Computing, 51, 83-95.
    Liaw, A., & Wiener, M. (2002). Classification and regression by randomForest. R News, 2(3), 18-22.
    Lin, J., Keogh, E., Wei, L., & Lonardi, S. (2007). Experiencing SAX: a novel symbolic representation of time series. Data Mining and Knowledge Discovery, 15(2), 107-144.
    Lines, J., Taylor, S., & Bagnall, A. (2016, December). Hive-cote: The hierarchical vote collective of transformation-based ensembles for time series classification. Proceedings of the 2016 IEEE 16th International Conference on Data Mining (ICDM), 1041-1046. IEEE.
    Middlehurst, M., Vickers, W., & Bagnall, A. (2019, November). Scalable dictionary classifiers for time series classification. Proceedings of the International Conference on Intelligent Data Engineering and Automated Learning, 11-19. Springer, Cham.
    Müller, M. (2007). Dynamic time warping. Information Retrieval for Music and Motion, 69-84. Springer, Berlin, Heidelberg.
    Niennattrakul, V., Ruengronghirunya, P., & Ratanamahatana, C. A. (2010). Exact indexing for massive time series databases under time warping distance. Data Mining and Knowledge Discovery, 21(3), 509-541.
    Noble, W. S. (2006). What is a support vector machine?. Nature Biotechnology, 24(12), 1565-1567.
    Ruiz, A. P., Flynn, M., Large, J., Middlehurst, M., & Bagnall, A. (2021). The great multivariate time series classification bake off: a review and experimental evaluation of recent algorithmic advances. Data Mining and Knowledge Discovery, 35(2), 401-449.
    Sagi, O., & Rokach, L. (2018). Ensemble learning: A survey. Wiley Interdisciplinary Reviews: Data Mining and Knowledge Discovery, 8(4), e1249.
    Sánchez-Sánchez, P. A., García-González, J. R., & Coronell, L. H. P. (2019). Encountered problems of time series with neural networks: Models and architectures. Recent Trends in Artificial Neural Networks-from Training to Prediction.
    Schäfer, P. (2015). The BOSS is concerned with time series classification in the presence of noise. Data Mining and Knowledge Discovery, 29(6), 1505-1530.
    Schäfer, P. (2016). Scalable time series classification. Data Mining and Knowledge Discovery, 30(5), 1273-1298.
    Schäfer, P., & Högqvist, M. (2012, March). SFA: a symbolic fourier approximation and index for similarity search in high dimensional datasets. Proceedings of the 15th International Conference on Extending Database Technology, 516-527.
    Schapire, R. E. (2003). The boosting approach to machine learning: An overview. Nonlinear Estimation and Classification, 149-171.
    Selcuk, S. (2017). Predictive maintenance, its implementation and latest trends. Proceedings of the Institution of Mechanical Engineers, Part B: Journal of Engineering Manufacture, 231(9), 1670-1679.
    Sikora, R. (2015). A modified stacking ensemble machine learning algorithm using genetic algorithms. Handbook of Research on Organizational Transformations through Big Data Analytics, 43-53. IGi Global.
    Susmaga, R. (2004). Confusion matrix visualization. Intelligent Information Processing and Web Mining, 107-116. Springer, Berlin, Heidelberg.
    Tang, Y. (2013). Deep learning using linear support vector machines. arXiv preprint arXiv:1306.0239.
    Torlay, L., Perrone-Bertolotti, M., Thomas, E., & Baciu, M. (2017). Machine learning–XGBoost analysis of language networks to classify patients with epilepsy. Brain Informatics, 4(3), 159-169.
    Tsakiridis, N. L., Tziolas, N. V., Theocharis, J. B., & Zalidis, G. C. (2019). A genetic algorithm‐based stacking algorithm for predicting soil organic matter from vis–NIR spectral data. European Journal of Soil Science, 70(3), 578-590.
    Wang, Z., Yan, W., & Oates, T. (2017, May). Time series classification from scratch with deep neural networks: A strong baseline. Proceedings of the 2017 International Joint Conference on Neural Networks (IJCNN), 1578-1585. IEEE.
    Whitley, D. (1994). A genetic algorithm tutorial. Statistics and Computing, 4(2), 65-85.
    Wistreich, J. G. (1958). The fundamentals of wire drawing. Metallurgical Reviews, 3(1), 97-142.
    Ye, L., & Keogh, E. (2009). Time series shapelets: a new primitive for data mining. Proceedings of the 15th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 947-956.
    Zang, W., Zhang, P., Zhou, C., & Guo, L. (2014). Comparative study between incremental and ensemble learning on data streams: Case study. Journal of Big Data, 1(1), 1-16.
    Zhang, Y. (2012, September). Support vector machine classification algorithm and its application. Proceedings of the International Conference on Information Computing and Applications, 179-186. Springer, Berlin, Heidelberg.
    Zhang, Y., & Haghani, A. (2015). A gradient boosting method to improve travel time prediction. Transportation Research Part C: Emerging Technologies, 58, 308-324.
    Zhou, Z. H. (2021). Ensemble learning. Machine Learning, 181-210. Springer, Singapore.
    Zonta, T., Da Costa, C. A., da Rosa Righi, R., de Lima, M. J., da Trindade, E. S., & Li, G. P. (2020). Predictive maintenance in the Industry 4.0: A systematic literature review. Computers & Industrial Engineering, 150, 106889.

    Full text is not available for download.
    Full-text release date: 2025/09/19 (campus network)
    Full-text release date: 2032/09/19 (off-campus network)
    Full-text release date: 2042/09/19 (National Central Library: Taiwan NDLTD system)