
Author: 潘柏瑞 (Po-Jui Pan)
Thesis Title: 結合BERT新聞模型與財報數據之企業信用風險指標預測模型
Prediction of Credit Risk Index Using BERT News Model and Financial Report
Advisor: 呂永和 (Yung-Ho Leu)
Committee Members: 楊維寧, 陳雲岫
Degree: Master
Department: College of Management - Department of Information Management
Year of Publication: 2020
Graduation Academic Year: 108
Language: English
Number of Pages: 46
Chinese Keywords: Financial distress, NLP, BERT, pre-trained language model
Foreign Keywords: Financial distress, NLP, BERT, pre-trained language model
    Before the advancement of natural language processing, companies and organizations around the world accumulated large collections of text documents over decades. Today, these organizations want to put those archives to good use through machine learning.
    Because of the downturn in the global economy, many companies have encountered financial distress that can push them to the edge of bankruptcy. Distressed companies may borrow from banks to maintain sufficient cash flow; nevertheless, if they cannot resolve the underlying problems, they still face bankruptcy. Eventually their debts grow beyond what they can repay, and no bank will lend to them anymore. To help avoid this situation, our goal is to predict a company's financial distress using several critical financial variables together with our self-defined features, the risk probabilities.
    Since the Bidirectional Encoder Representations from Transformers (BERT) model was released at the end of 2018, it has reshaped thinking in the field of natural language processing. From classification and multiple-choice tasks to question answering, most NLP tasks have improved substantially with BERT. In this work, the risk probabilities are generated from news articles about specific companies by a BERT model trained on the eLand Risk News Dataset. This dataset, which contains five kinds of risk labels, is used to train our first BERT model to classify news into five risk classes. With this model, we extract the five risk probabilities from the news data and concatenate them with financial variables to train our second model to predict TCRI levels. TCRI is a credit risk rating that over 90% of banks consult to understand the financial circumstances of their clients.
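
    The following is a minimal sketch of the two-stage pipeline described above, assuming the Hugging Face Transformers and scikit-learn libraries. The checkpoint name bert-base-chinese, the helper names risk_probabilities and build_features, the column names in financial_df, and the use of a random-forest classifier as the second (TCRI) model are illustrative assumptions, not the thesis's exact setup.

    import numpy as np
    import pandas as pd
    import torch
    from transformers import BertTokenizer, BertForSequenceClassification
    from sklearn.ensemble import RandomForestClassifier

    # Stage 1: a BERT model fine-tuned to classify a news article into 5 risk classes
    # (assumed to have been fine-tuned on the eLand Risk News Dataset).
    tokenizer = BertTokenizer.from_pretrained("bert-base-chinese")
    risk_model = BertForSequenceClassification.from_pretrained(
        "bert-base-chinese", num_labels=5)
    risk_model.eval()

    def risk_probabilities(news_texts):
        """Return the 5-class risk probability vector averaged over a company's news."""
        inputs = tokenizer(news_texts, padding=True, truncation=True,
                           max_length=512, return_tensors="pt")
        with torch.no_grad():
            logits = risk_model(**inputs).logits
        probs = torch.softmax(logits, dim=-1)        # one 5-dim vector per article
        return probs.mean(dim=0).numpy()             # aggregate per company

    # Stage 2: concatenate the risk probabilities with financial variables and train a
    # classifier to predict the TCRI level (a stand-in for the thesis's second model).
    # `financial_df` is assumed to hold one row per company with its financial ratios,
    # a list of that company's news texts, and the target TCRI level.
    def build_features(row):
        return np.concatenate([row["financial_variables"],        # key financial ratios
                               risk_probabilities(row["news"])])  # 5 risk probabilities

    financial_df = pd.DataFrame()  # placeholder: load the financial/TCRI data here
    if len(financial_df) > 0:
        X = np.vstack([build_features(r) for _, r in financial_df.iterrows()])
        y = financial_df["tcri_level"].values
        tcri_model = RandomForestClassifier(n_estimators=300, class_weight="balanced")
        tcri_model.fit(X, y)  # predicts TCRI risk levels from combined features

    Averaging the per-article probability vectors into one vector per company, and weighting classes to counter label imbalance, are design choices of this sketch that mirror the aggregation and imbalanced-class concerns raised in the thesis, not a verbatim reproduction of its method.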



    ABSTRACT i
    ACKNOWLEDGEMENT ii
    TABLE OF CONTENTS iii
    LIST OF FIGURES v
    LIST OF TABLES vi
    Chapter 1 Introduction 1
      1.1 RESEARCH BACKGROUND 1
      1.2 RESEARCH MOTIVATION 1
      1.3 RESEARCH PURPOSE 2
      1.4 RESEARCH OVERVIEW 2
    Chapter 2 Related Work 4
      2.1 IMBALANCED CLASS 4
        2.1.1 Metric Methods 5
        2.1.2 Oversampling and Undersampling 5
        2.1.3 Generate Samples 6
      2.2 FINANCIAL DISTRESS 6
      2.3 TAIWAN CORPORATE CREDIT INDEX 7
      2.4 NATURAL LANGUAGE PROCESSING BEFORE BERT 10
      2.5 BERT - BIDIRECTIONAL ENCODER REPRESENTATION FROM TRANSFORMER 11
        2.5.1 Attention and Transformer 11
        2.5.2 Pre-trained Language Model and Fine-Tuning 12
    Chapter 3 Research Method 16
      3.1 EXPERIMENT FLOW 16
      3.2 DATASET DESCRIPTION 17
        3.2.1 eLand Risk Dataset 17
        3.2.2 TEJ News Dataset 18
        3.2.3 TCRI Annual Dataset 18
      3.3 BERT TOKENIZING AND TEXT PREPROCESSING 18
      3.4 BERT TRAINING 20
        3.4.1 Stratify Sampling 20
        3.4.2 Training the BERT Risks Classification Model and Save Model 20
      3.5 RISK SCORE CALCULATION 21
      3.6 TCRI PREDICTION 22
      3.7 EVALUATION METRICS 22
        3.7.1 Confusion Matrix 22
        3.7.2 Evaluation scores 24
    Chapter 4 Experiment Results 25
      4.1 EXPERIMENTAL ENVIRONMENT 25
      4.2 PARAMETERS SETTING 25
      4.3 BERT MODELS RESULTS 28
      4.4 TCRI MODEL RESULTS 29
    Chapter 5 Conclusion and Future Research 33
      5.1 CONCLUSION 33
      5.2 FUTURE RESEARCH 34
    Reference 35

    [1] Davis, J., & Goadrich, M. (2006, June). The relationship between Precision-Recall and ROC curves. In Proceedings of the 23rd international conference on Machine learning (pp. 233-240). ACM.
    [2] Drummond, C., & Holte, R. C. (2003, August). C4. 5, class imbalance, and cost sensitivity: why under-sampling beats over-sampling. In Workshop on learning from imbalanced datasets II (Vol. 11, pp. 1-8). Washington, DC: Citeseer.
    [3] Barandela, R., Valdovinos, R. M., Sánchez, J. S., & Ferri, F. J. (2004, August). The imbalanced training sample problem: Under or over sampling?. In Joint IAPR international workshops on statistical techniques in pattern recognition (SPR) and structural and syntactic pattern recognition (SSPR) (pp. 806-814). Springer, Berlin, Heidelberg.
    [4] Chawla, N. V., Bowyer, K. W., Hall, L. O., & Kegelmeyer, W. P. (2002). SMOTE: synthetic minority over-sampling technique. Journal of artificial intelligence research, 16, 321-357.
    [5] Beaver, W. H. (1966). Financial ratios as predictors of failure. Journal of accounting research, 71-111.
    [6] Altman, E. I. (1968). Financial ratios, discriminant analysis and the prediction of corporate bankruptcy. The journal of finance, 23(4), 589-609.
    [7] Deakin, E. B. (1972). A discriminant analysis of predictors of business failure. Journal of accounting research, 167-179.
    [8] Blum, M. (1974). Failing company discriminant analysis. Journal of accounting research, 1-25.
    [9] Ohlson, J. A. (1980). Financial ratios and the probabilistic prediction of bankruptcy. Journal of accounting research, 109-131.
    [10] Zmijewski, M. E. (1984). Methodological issues related to the estimation of financial distress prediction models. Journal of Accounting research, 59-82.
    [11] TEJ, TEJ 台灣經濟新報文化事業股份有限公司 Retrieved from https://www.tej.com.tw
    [12] TEJ, TEJ TCRI 台灣企業信用風險指標 Retrieved from https://www.tej.com.tw/tcri
    [13] Bahdanau, D., Cho, K., & Bengio, Y. (2014). Neural machine translation by jointly learning to align and translate. arXiv preprint arXiv:1409.0473.
    [14] Luong, M. T., Pham, H., & Manning, C. D. (2015). Effective approaches to attention-based neural machine translation. arXiv preprint arXiv:1508.04025.
    [15] Gehring, J., Auli, M., Grangier, D., Yarats, D., & Dauphin, Y. N. (2017, August). Convolutional sequence to sequence learning. In Proceedings of the 34th International Conference on Machine Learning-Volume 70 (pp. 1243-1252). JMLR. org.
    [16] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A. N., ... & Polosukhin, I. (2017). Attention is all you need. In Advances in neural information processing systems (pp. 5998-6008).
    [17] Peters, M. E., Neumann, M., Iyyer, M., Gardner, M., Clark, C., Lee, K., & Zettlemoyer, L. (2018). Deep contextualized word representations. arXiv preprint arXiv:1802.05365.
    [18] Radford, A., Narasimhan, K., Salimans, T., & Sutskever, I. (2018). Improving language understanding by generative pre-training. URL https://s3-us-west-2. amazonaws. com/openai-assets/researchcovers/languageunsupervised/language understanding paper. pdf.
    [19] Devlin, J., Chang, M. W., Lee, K., & Toutanova, K. (2018). Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805.
    [20] Friedman, J. H. (1991). Multivariate adaptive regression splines. The annals of statistics, 19(1), 1-67.
    [21] Lample, G., & Conneau, A. (2019). Cross-lingual language model pretraining. arXiv preprint arXiv:1901.07291.
    [22] Yang, Z., Dai, Z., Yang, Y., Carbonell, J., Salakhutdinov, R., & Le, Q. V. (2019). XLNet: Generalized Autoregressive Pretraining for Language Understanding. arXiv preprint arXiv:1906.08237.
    [23] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., ... & Stoyanov, V. (2019). Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692.
    [24] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., ... & Liu, P. J. (2019). Exploring the limits of transfer learning with a unified text-to-text transformer. arXiv preprint arXiv:1910.10683.
    [25] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., & Soricut, R. (2019). Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942.

    Full-text release date: 2025/01/17 (campus network)
    Full text not authorized for public release (off-campus network)
    Full text not authorized for public release (National Central Library: Taiwan theses and dissertations system)