
Graduate student: 王勁凱 (Jin-Kai Wang)
Thesis title: 基植於卷積神經網路的多進多出大數據預測分析於股票收盤價格應用
The Multiple Input Multiple Output Big Data Predictive Analytics for Stock Prices based on the Convolutional Neural Network Models
Advisor: 羅士哲 (Shih-Che Lo)
Oral defense committee: 蔡鴻旭 (Hung-Hsu Tsai), 曹譽鐘 (Yu-Chung Tsao), 羅士哲 (Shih-Che Lo)
Degree: Master
Department: School of Management - Department of Industrial Management
Year of publication: 2020
Graduation academic year: 108
Language: English
Number of pages: 53
Chinese keywords: 大數據預測分析, 股票價格預測, 指數平滑法, 卷積神經網路, 支持向量迴歸
English keywords: Big Data Predictive Analytics, Stock Price Prediction, Exponential Smoothing, Convolution Neural Network, Support Vector Regression
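  • With the rise of big data analytics, a growing number of scholars have proposed applying big data analytics tools to industrial and commercial problems in order to increase profits and reduce costs. Commercial big data problems cover many topics, such as financial status review, marketing strategy, real estate value forecasting, and stock closing price forecasting.
    The stock market is a centralized market characterized by high uncertainty, volatility, and a variety of expectations. Predicting stock prices accurately is difficult, so scholars have long proposed many models and theories for doing so. Many have applied financial models, machine learning, and deep learning from big data analytics to predict stock prices. Most of these models are single input single output (1-1) or multiple input single output (m-1) models; to our knowledge, almost no multiple input multiple output (m-m) models have been applied to stock prediction. In this thesis we therefore propose a multiple input multiple output model based on the convolutional neural network to predict the closing prices of several stocks at once.
    We use Holt-Winters exponential smoothing and support vector regression as the benchmarks for comparison, and RMSE, MAE, and MAPE as the measures of forecasting accuracy. While building the forecasting models, we also apply design of experiments to tune the model parameters and use a sliding window technique to improve accuracy. For the experiments we collect 19 years of historical data for two indexes and three stocks: the Taiwan Capitalization Weighted Stock Index, the Taipei Exchange (over-the-counter) index, Taiwan Semiconductor Manufacturing (2330), Hon Hai Precision Industry (2317), and Chunghwa Telecom (2412). The first 90% of the data serves as the training set used to train the models, and the final 10% serves as the test set. The results show that support vector regression is the most accurate, followed by our proposed CNN-based multiple input multiple output model, and finally Holt-Winters exponential smoothing, which indicates that the multiple input multiple output model proposed in this thesis is usable for big data predictive analytics.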


    With the rise of big data predictive analytics, more and more scholars have proposed applying big data predictive analytics tools to industrial and commercial problems to increase profits and reduce costs. Commercial big data applications cover many topics, such as financial status review, marketing strategy, real estate value forecasting, and stock closing price forecasting. The stock market is a centralized market with high uncertainty, volatility, and varied expectations. Predicting stock prices accurately is difficult, so scholars have proposed many models and theories for stock prediction over the years. Most have applied financial models, machine learning, and deep learning from big data analytics to predict stock prices. To our knowledge, most models are designed as single input single output (1-1) or multiple input single output (m-1) models, and only a few use multiple input multiple output (m-m) models to predict stock prices. Therefore, in this thesis we propose a multiple input multiple output 1D CNN model that predicts the closing prices of several stocks at once.
    We use the Holt-Winters exponential smoothing method and Support Vector Regression (SVR) as the benchmarks, and RMSE, MAE, and MAPE as the performance measures for our predictions. While building the prediction models, we also apply Design of Experiments to tune the model parameters and use a sliding window technique to increase forecasting accuracy. Our experiment uses 4713 days of historical data for two indexes and three stocks from the Taiwan Stock Exchange (TWSE) and the Taipei Exchange: the Taiwan Capitalization Weighted Stock Index (TAIEX), the Taipei Exchange Capitalization Weighted Stock Index (TPEX), Taiwan Semiconductor Manufacturing (2330), Hon Hai Precision Ind. Co., Ltd. (2317), and Chunghwa Telecom Co., Ltd. (2412). We use the first 90% of the data as the training dataset and the remaining 10% as the testing dataset. The prediction results show that SVR has the best accuracy, followed by our proposed multiple input multiple output 1D CNN, and finally the Holt-Winters exponential smoothing method. The results indicate that the multiple input multiple output 1D CNN model proposed in this thesis is feasible for big data predictive analytics.
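    The data preparation and evaluation steps named in the abstract (sliding window, chronological 90%/10% split, and RMSE, MAE, and MAPE) can be sketched roughly as follows. This is a minimal illustration, not the thesis code: the function names, the window length of 10, the synthetic prices, and the naive "tomorrow equals today" baseline are all assumptions made for the example.

```python
import numpy as np


def make_windows(series: np.ndarray, window: int):
    """Build (X, Y) pairs from a (T, m) array of closing prices.

    X[i] holds `window` consecutive days for all m series (multiple input);
    Y[i] holds the next day's value for all m series (multiple output).
    """
    T = series.shape[0]
    X = np.stack([series[i:i + window] for i in range(T - window)])
    Y = series[window:]
    return X, Y


def chronological_split(X, Y, train_frac=0.9):
    """Keep time order: the first `train_frac` of samples train, the rest test."""
    cut = int(len(X) * train_frac)
    return (X[:cut], Y[:cut]), (X[cut:], Y[cut:])


def rmse(y_true, y_pred):
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))


def mae(y_true, y_pred):
    return float(np.mean(np.abs(y_true - y_pred)))


def mape(y_true, y_pred):
    """Mean absolute percentage error in percent; assumes y_true has no zeros."""
    return float(np.mean(np.abs((y_true - y_pred) / y_true)) * 100.0)


# Synthetic stand-in for 5 price series over 200 days (NOT the thesis data).
rng = np.random.default_rng(0)
prices = 100.0 + np.cumsum(rng.normal(size=(200, 5)), axis=0)

X, Y = make_windows(prices, window=10)  # window length is an assumed value
(train_X, train_Y), (test_X, test_Y) = chronological_split(X, Y)

# Hypothetical naive baseline: predict each series' next close as its last
# observed close. A trained model's predictions would replace `naive_pred`.
naive_pred = test_X[:, -1, :]
print("shapes:", X.shape, Y.shape, train_X.shape, test_X.shape)
print("RMSE:", rmse(test_Y, naive_pred))
print("MAE:", mae(test_Y, naive_pred))
print("MAPE (%):", mape(test_Y, naive_pred))
```

    Splitting by position rather than at random keeps every test sample later in time than every training sample, which matches the abstract's description of training on the first part of the data and testing on the final 10%.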

    Abstract (Chinese)
    ABSTRACT
    ACKNOWLEDGEMENTS
    FIGURES
    TABLES
    CHAPTER 1 INTRODUCTION
    1.1 Research Motivation
    1.2 Research Objectives
    1.3 Research Structure
    CHAPTER 2 LITERATURE REVIEW
    2.1 Big Data
    2.2 Financial Forecasting Models
    2.3 Machine Learning Models
    2.4 Artificial Neural Networks
    2.5 Convolution Neural Networks
    CHAPTER 3 RESEARCH METHODOLOGY
    3.1 Big Data Analytics
    3.2 Forecast Stock Prices Flow Chart
    3.3 Holt–Winters Exponential Smoothing
    3.4 Support Vector Regression
    3.5 Convolution Neural Network
    3.6 Sliding Window
    3.7 Forecasting Performance Measurement
    CHAPTER 4 COMPUTATIONAL EXPERIMENTS
    4.1 Data Type, Preprocessing, Models Application and DOE
    4.2 The Forecasting Results of Stock Prices and Discussion
    CHAPTER 5 CONCLUSIONS AND FUTURE RESEARCH
    5.1 Conclusions
    5.2 Future Research
    REFERENCE


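    Full-text release date: 2025/07/14 (campus network)
    Full-text release date: 2025/07/14 (off-campus network)
    Full-text release date: full text not authorized for public release (National Central Library: Taiwan theses and dissertations system)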