
Author: 邱辰昀 (Chen-Yun Chiou)
Thesis Title: 應用卷積神經網路合併梯度提升決策樹模型於多重輸出問題之工業大數據預測分析
(Apply CNN-XGBoost Approach in Industrial Big Data Predictive Analytics for Multiple Output Problems)
Advisor: 羅士哲 (Shih-Che Lo)
Committee Members: 范書愷 (Shu-Kai Fan), 曹譽鐘 (Yu-Chung Tsao), 羅士哲 (Shih-Che Lo)
Degree: Master's
Department: Department of Industrial Management, College of Management
Publication Year: 2021
Academic Year of Graduation: 109
Language: Chinese
Number of Pages: 49
Keywords: Big Data Predictive Analytics, Convolutional Neural Network, eXtreme Gradient Boosting Decision Tree, Multi-Output Prediction Problems
Hits: 266; Downloads: 5
Abstract:
Industry 4.0 has brought the rise of big data analytics to the world. Big data analytics has a wide range of applications, with countless cases in the industrial field. The spread of smart devices and networks, together with more sensitive and inexpensive sensing components, has made smart factories possible. The data collected from sensors are becoming increasingly complex and diverse, and prediction targets have shifted from the common single-output setting to multi-output requirements. This study proposes a CNN-XGBoost architecture that combines the XGBoost model from machine learning with the CNN model from deep learning. The advantage of the CNN model lies in its dimensionality reduction mechanism: the operations of the convolutional and pooling layers reduce the number of attributes in multi-attribute data, which benefits the subsequent model analysis.
    Although the XGBoost model, widely used in machine learning, outperforms many other models in computational performance and speed, its prediction of multi-attribute data is still not ideal. The CNN dimensionality reduction mechanism is therefore attached to the front stage of the CNN-XGBoost model. Compared with the MLP back end of the original CNN architecture, the machine learning model XGBoost is simpler and faster in computation and operation, and it satisfies the multi-output prediction requirement.
    This study uses MSE as the performance evaluation index and applies k-fold cross-validation to stabilize the prediction accuracy. The BPNN model, common in deep learning, and the well-known RF model from machine learning serve as comparison baselines. The experimental results show that the prediction accuracy of the CNN-XGBoost approach is very close to that of the BPNN and RF models and better than that of the standalone CNN and XGBoost models, while its running efficiency is much higher than that of the CNN model.
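    The abstract describes a two-stage pipeline: a 1D-CNN front end whose convolutional and pooling layers compress many input attributes into a compact feature vector, and an XGBoost back end that replaces the usual MLP head to produce multiple outputs. Below is a minimal sketch of that data flow, assuming Keras and the xgboost/scikit-learn APIs; the layer sizes, hyperparameters, and random data are placeholders rather than the thesis settings, and in practice the CNN front end would be trained (for example, with a temporary MLP head) before its features are reused.

```python
# Minimal sketch of the CNN-XGBoost data flow; all sizes are illustrative.
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers
from xgboost import XGBRegressor
from sklearn.multioutput import MultiOutputRegressor

n_samples, n_attributes, n_outputs = 1000, 64, 6
X = np.random.rand(n_samples, n_attributes, 1)   # multi-attribute sensor data
y = np.random.rand(n_samples, n_outputs)         # several quality targets

# Front end: convolution + max pooling shrink the attribute dimension,
# then the feature maps are flattened into one feature vector per sample.
# (In the full method this extractor would be trained first; it is left
# untrained here to keep the sketch short.)
extractor = keras.Sequential([
    layers.Input(shape=(n_attributes, 1)),
    layers.Conv1D(16, kernel_size=3, activation="relu"),
    layers.MaxPooling1D(pool_size=2),
    layers.Flatten(),
])
features = extractor.predict(X, verbose=0)       # reduced representation

# Back end: XGBoost replaces the MLP head; wrapping it in
# MultiOutputRegressor fits one booster per output target.
model = MultiOutputRegressor(XGBRegressor(n_estimators=100, max_depth=4))
model.fit(features, y)
print(model.predict(features).shape)             # (1000, 6)
```

    Fitting one booster per output is the simplest way to meet the multi-output requirement with XGBoost's scikit-learn API; a single-output variant would fit one XGBRegressor directly on one target column.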

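    The evaluation protocol itself is standard: MSE = (1/n) Σ (y_i - ŷ_i)², computed per fold and averaged. A minimal sketch follows, using 5 folds as in the thesis tables and the RF baseline mentioned above; the data and model settings are placeholders.

```python
# Minimal sketch of 5-fold cross-validation with MSE as the metric.
import numpy as np
from sklearn.model_selection import KFold
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_squared_error

X = np.random.rand(500, 20)   # placeholder attributes
y = np.random.rand(500, 3)    # placeholder multi-output targets

fold_mse = []
for train_idx, test_idx in KFold(n_splits=5, shuffle=True, random_state=0).split(X):
    model = RandomForestRegressor(n_estimators=100, random_state=0)
    model.fit(X[train_idx], y[train_idx])
    # mean_squared_error averages over outputs as well as samples by default.
    fold_mse.append(mean_squared_error(y[test_idx], model.predict(X[test_idx])))

print("MSE per fold:", [round(m, 4) for m in fold_mse])
print("mean MSE over folds:", round(float(np.mean(fold_mse)), 4))
```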
    Table of Contents:
    Abstract (Chinese) ..... I
    Abstract (English) ..... II
    Acknowledgements ..... III
    Table of Contents ..... IV
    List of Figures ..... VI
    List of Tables ..... VII
    Chapter 1 Introduction ..... 1
      1.1 Research Background and Motivation ..... 1
      1.2 Research Objectives ..... 3
      1.3 Research Framework ..... 4
    Chapter 2 Literature Review ..... 6
      2.1 Big Data ..... 6
      2.2 Machine Learning ..... 8
      2.3 Deep Learning ..... 9
      2.4 Convolutional Neural Networks (CNN) ..... 10
      2.5 eXtreme Gradient Boosting (XGBoost) ..... 13
    Chapter 3 Methodology ..... 15
      3.1 Big Data ..... 15
      3.2 Random Forest (RF) Imputation ..... 18
      3.3 Artificial Neural Network (ANN) ..... 20
      3.4 Convolutional Neural Networks (CNN) ..... 24
      3.5 eXtreme Gradient Boosting (XGBoost) ..... 28
      3.6 CNN-XGBoost Model Architecture ..... 31
      3.7 Prediction Performance Metrics ..... 32
    Chapter 4 Results ..... 33
      4.1 Experimental Procedure, Model Architecture, and Parameter Settings ..... 33
      4.2 Model Prediction Results and Discussion ..... 40
    Chapter 5 Conclusions ..... 43
      5.1 Conclusions ..... 43
      5.2 Future Research ..... 44
    Chapter 6 References ..... 45
      6.1 Chinese References ..... 45
      6.2 English References ..... 45

    【List of Figures】
    Figure 1.1 The four stages of data analytics ..... 2
    Figure 1.2 Research flowchart ..... 5
    Figure 3.1 Conceptual flowchart of a smart factory ..... 17
    Figure 3.2 CART structure diagram ..... 19
    Figure 3.3 The random forest modeling process ..... 20
    Figure 3.4 Schematic of the ANN model architecture ..... 21
    Figure 3.5 Schematic of the BPNN model architecture ..... 22
    Figure 3.6 Flowchart of the BPNN learning mechanism ..... 22
    Figure 3.7 Schematic of one-dimensional convolution ..... 25
    Figure 3.8 Schematic of the max pooling mechanism ..... 26
    Figure 3.9 Schematic of the feature map flattening process ..... 26
    Figure 3.10 Schematic of the 1D-CNN architecture ..... 27
    Figure 3.11 Schematic of the XGBoost architecture ..... 28
    Figure 3.12 Schematic of the single-output CNN-XGBoost model architecture ..... 32
    Figure 4.1 Experimental flowchart ..... 33
    Figure 4.2 Schematic of the multi-output CNN-XGBoost model architecture ..... 35
    Figure 4.3 XGBoost-A1 single-factor DOE results ..... 37
    Figure 4.4 XGBoost-A2 single-factor DOE results ..... 37
    Figure 4.5 XGBoost-A3 single-factor DOE results ..... 38
    Figure 4.6 XGBoost-A4 single-factor DOE results ..... 38
    Figure 4.7 XGBoost-A5 single-factor DOE results ..... 39
    Figure 4.8 XGBoost-A6 single-factor DOE results ..... 39
    Figure 4.9 CNN-XGBoost model convergence plot ..... 40

    【List of Tables】
    Table 3.1 The three types of big data ..... 15
    Table 4.1 Parameter settings of the convolutional and pooling layers ..... 34
    Table 4.2 Output dimensions of each layer in the dimensionality reduction stage ..... 34
    Table 4.3 ANOVA table for the max_depth and n_estimators interaction ..... 35
    Table 4.4 ANOVA table for the min_child_weight and n_estimators interaction ..... 35
    Table 4.5 ANOVA table for the learning_rate and n_estimators interaction ..... 36
    Table 4.6 ANOVA table for the min_child_weight and max_depth interaction ..... 36
    Table 4.7 ANOVA table for the learning_rate and max_depth interaction ..... 36
    Table 4.8 ANOVA table for the learning_rate and min_child_weight interaction ..... 36
    Table 4.9 Best XGBoost parameter settings ..... 40
    Table 4.10 MSE of five 5-fold cross-validation runs and their average ..... 41
    Table 4.11 MSE comparison of multi-output models predicting drill machining quality ..... 42
    Table 4.12 MSE comparison of single-output models predicting drill machining quality ..... 42
    Table 4.13 Running time comparison of the CNN-XGBoost, CNN, and XGBoost models ..... 42


    Full-text release date: 2024/06/21 (off-campus access)
    Full-text release date: 2024/06/21 (National Central Library: Taiwan NDLTD system)