
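Graduate student: Wei-Chih Lin (林威志)
Thesis title: Stock Price Fluctuation Prediction by Opinion Mining and Financial Indicators (結合輿情文字探勘與財經指標的股價漲跌預測模型)
Advisor: Shi-Woei Lin (林希偉)
Oral defense committee: Wen-Yeh Huang (黃文曄), Yi-Nung Peng (彭奕農)
Degree: Master
Department: College of Management - Department of Industrial Management
Year of publication: 2023
Graduation academic year: 111 (ROC calendar)
Language: Chinese
Number of pages: 48
Keywords (translated from Chinese): text mining, sentiment analysis, topic analysis, stock market movement prediction, sentiment dictionary
Keywords (English): text mining, sentiment analysis, topic modeling, stock price prediction, sentiment dictionary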
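  • Although the efficient market hypothesis holds that stock market prices cannot be predicted, behavioral finance theory argues that the hypothesis does not necessarily hold, and the direction of stock price movements has become an important topic in financial predictive modeling. As information technology advances, investors draw on increasingly diverse sources of market information, and the large volume of market information carried by news and social media may influence investment judgment and, in turn, cause stock market fluctuations. This study builds a stock-market-specific sentiment dictionary to measure the sentiment contained in news articles or online opinion, uses topic analysis from text mining to extract the topics behind those articles or opinions, combines sentiment and topics into new predictive indicators, and finally feeds them, together with financial and stock market technical indicators, into machine learning prediction. The proposed method is validated with data from the PTT Stock board and the Taiwan stock market; the empirical results indicate that including sentiment and topic analysis improves the accuracy of predicting the direction of stock price movements. The proposed analytical framework, which combines a custom domain-specific sentiment dictionary with topic analysis, captures the important aspects of a text more precisely and could be extended to other fields to improve the quality of prediction and decision making.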


    Although the efficient market hypothesis states that asset prices reflect all available information and that stock market fluctuations are therefore unpredictable, behavioral finance takes a different view and holds that stock prices can be influenced by psychological biases and behaviors. With the advancement of information technology, investors can now access diverse sources of market information. The market information carried by news and social media may therefore affect investment judgment and cause stock market fluctuations. In this study, we build a stock market sentiment dictionary to measure the sentiment in news articles or online opinions, use topic modeling to extract the topics behind those articles or opinions, and combine the sentiment associated with the extracted topics into new indicators, which are then combined with financial and stock market technical indicators for machine learning prediction. The effectiveness of the proposed method is verified using posts from the PTT Stock board and Taiwan stock market data. The results show that incorporating sentiment and topic analysis improves the accuracy of predicting stock price movements. The proposed analysis framework, which includes a customized sentiment dictionary and topic analysis, captures the important information in text and can be applied in other fields to improve the quality of prediction and decision making.
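    The abstract describes combining dictionary-based sentiment with extracted topics into new indicators but does not give the formula on this page. The following is a minimal sketch of one plausible way to do this, not the thesis's actual method: each post gets a sentiment score from a word list, and scores are averaged per topic using the post's topic proportions as weights. The toy English lexicon, the function names, and the weighting scheme are all assumptions made for illustration.

```python
# Hypothetical toy lexicon: word -> polarity (+1 positive, -1 negative).
# A real stock-market dictionary would be far larger and domain-specific.
SENTIMENT_DICT = {
    "bullish": 1, "rally": 1, "profit": 1,
    "bearish": -1, "crash": -1, "loss": -1,
}


def sentiment_score(tokens, lexicon=SENTIMENT_DICT):
    """Mean polarity of the tokens found in the lexicon; 0.0 if none match."""
    hits = [lexicon[t] for t in tokens if t in lexicon]
    return sum(hits) / len(hits) if hits else 0.0


def topic_sentiment_features(posts, n_topics):
    """Aggregate posts into one sentiment indicator per topic.

    Each post is a pair (tokens, topic_weights), where topic_weights is a
    list of n_topics non-negative numbers, e.g. the document-topic
    proportions produced by a topic model (assumed to be given).
    """
    weighted = [0.0] * n_topics  # sum of weight * sentiment per topic
    total = [0.0] * n_topics     # sum of weights per topic
    for tokens, topic_weights in posts:
        score = sentiment_score(tokens)
        for k, w in enumerate(topic_weights):
            weighted[k] += w * score
            total[k] += w
    # Weighted average per topic; topics with no weight get a neutral 0.0.
    return [weighted[k] / total[k] if total[k] > 0 else 0.0
            for k in range(n_topics)]


if __name__ == "__main__":
    day_posts = [
        (["chip", "rally", "profit"], [0.8, 0.2]),  # mostly topic 0, positive
        (["rates", "crash", "loss"], [0.2, 0.8]),   # mostly topic 1, negative
    ]
    print(topic_sentiment_features(day_posts, n_topics=2))
```

    The resulting per-topic values could then be appended to financial and technical indicators as extra input columns for a classifier such as logistic regression, a decision tree, or a random forest, which are the model families listed in the table of contents.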

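    Abstract (Chinese)
    Abstract (English)
    Acknowledgments
    Table of Contents
    List of Figures
    List of Tables
    Chapter 1 Introduction
      1.1 Research Background and Motivation
      1.2 Research Objectives
      1.3 Thesis Structure
    Chapter 2 Literature Review
      2.1 Behavioral Finance
      2.2 Influence of Media and Online Opinion on the Stock Market
      2.3 Applications of Text Mining to Financial Product Price Prediction
      2.4 Summary
    Chapter 3 Research Methods
      3.1 Sentiment Dictionary Construction
      3.2 Topic Model Analysis
      3.3 Machine Learning Models
        3.3.1 Input Variables
        3.3.2 Logistic Regression
        3.3.3 Decision Tree
        3.3.4 Random Forest
        3.3.5 Model Evaluation Methods
    Chapter 4 Analysis and Results
      4.1 Case Study and Data Description
        4.1.1 Data Preprocessing
      4.2 Sentiment Dictionary
        4.2.1 Stop Word Removal
        4.2.2 Manual Dictionary Labeling
        4.2.3 Dictionary Validity Check
      4.3 Topic Analysis
        4.3.1 Combining Topics and Sentiment
      4.4 Machine Learning Results
        4.4.1 Logistic Regression
        4.4.2 Decision Tree Results
        4.4.3 Random Forest Results
      4.5 Summary
    Chapter 5 Conclusions and Recommendations
      5.1 Conclusions
      5.2 Research Contributions
      5.3 Research Limitations and Suggestions for Future Research
    References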


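    Full-text release date: 2025/01/25 (campus network)
    Full-text release date: 2025/01/25 (off-campus network)
    Full-text release date: 2025/01/25 (National Central Library: Taiwan theses and dissertations system)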