
Graduate student: Zhong-yin Wang (王忠引)
Thesis title: Credit valuation modeling and evaluation using data mining for consumer loan market with skewed data (資料不對稱情況下應用資料探勘技術建構消費者貸款信用評估模型)
Advisors: Chiang-Sheng Lee (李強笙), Kung-jeng Wang (王孔政)
Oral examination committee: 褚志鵬
Degree: Master
Department: School of Management - Department of Industrial Management
Year of publication: 2008
Academic year of graduation: 96 (ROC calendar)
Language: English
Number of pages: 71
Keywords: Decision tree (決策樹), Skewed data (不對稱資料), Consumer loans (消費者貸款), Credit valuation (信用評估), Data mining (資料探勘), Multi-classifier committee approach (多專家分類器)
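  • Financial institutions face intense competition. To improve lending quality and reduce risk, this study uses a bank's borrower data and applies four data mining techniques (logistic regression, discriminant analysis, back-propagation neural network, and decision tree) to build consumer loan evaluation models. Classification performance is assessed with three indices: the proportion of normal-payment customers correctly classified, the proportion of default customers correctly classified, and the overall proportion correctly classified. The results show that the decision tree classifies better than the other three techniques. At a low confidence factor, however, the skewed data degrade the classification of default customers. The study therefore uses a multi-classifier committee to address the skewed-data problem and tunes the resulting decision tree model with two main parameters, the confidence factor and the threshold. The results show that, with a suitable confidence factor and threshold, the multi-classifier committee resolves the skewed-data problem and classifies well.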
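The three performance indices named in the abstract can be illustrated with a minimal sketch. The function name `classification_rates`, the 0/1 label encoding, and the sample data are assumptions made for illustration; the record does not say how the thesis computed or coded them.

```python
def classification_rates(actual, predicted, default_label=1):
    """Return (normal_rate, default_rate, overall_rate) for two label lists.

    Assumption: labels are 0 = normal payment, 1 = default.
    """
    hits = {"normal": 0, "default": 0}
    totals = {"normal": 0, "default": 0}
    for a, p in zip(actual, predicted):
        group = "default" if a == default_label else "normal"
        totals[group] += 1
        hits[group] += int(a == p)

    def rate(h, t):
        # Avoid division by zero when a class is absent from the sample.
        return h / t if t else float("nan")

    return (
        rate(hits["normal"], totals["normal"]),
        rate(hits["default"], totals["default"]),
        rate(sum(hits.values()), sum(totals.values())),
    )


# Hypothetical skewed sample: 9 normal customers (0) and 1 default (1).
actual = [0] * 9 + [1]
predicted = [0] * 10  # a model that labels every customer "normal"
print(classification_rates(actual, predicted))  # (1.0, 0.0, 0.9)
```

In this made-up sample the overall rate is 0.9 even though no default customer is detected, which is why the per-class rates are reported alongside the overall rate when the data are skewed.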


    The financial environment in Taiwan is highly competitive. To raise the quality of lending and reduce its risk, we develop a set of credit valuation models using data mining techniques for a consumer loan market with skewed data. Four data mining techniques are employed in the study: logistic regression, discriminant analysis, back-propagation artificial neural network (BPANN), and decision tree (using the C4.5 algorithm). The performance indices are the classification rates for normal and default payment and the overall classification rate. The experimental results show that the decision tree outperforms the other three techniques. To tackle the skewness of the data, a multi-classifier committee approach is further proposed. Two major parameters (the confidence factor and the threshold) are used to tune the naïve complete decision tree model so that prediction accuracy is maintained while the information burden is reduced. The multi-classifier committee approach deals successfully with the skewness of the data and reaches a high classification rate under a proper threshold and confidence factor setting.
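The abstract does not spell out how the committee is built, so the following is only a sketch of one common construction for skewed data, under stated assumptions: the majority class (MA, normal payment) is split into chunks, each chunk is paired with all minority cases (MI, default), one classifier is trained per subset, and the members vote against a threshold. The helper names, the single numeric feature, the one-split stump standing in for a C4.5 tree, and the reading of "threshold" as a vote share are all hypothetical; the confidence factor (C4.5's pruning parameter) is not modeled here.

```python
import random


def train_stump(rows):
    """Train a one-split classifier on (x, label) rows.

    Assumption: a single numeric feature where larger x suggests default (1).
    This stump is a stand-in for a full decision tree.
    """
    xs = sorted({x for x, _ in rows})
    cuts = [(a + b) / 2 for a, b in zip(xs, xs[1:])] or [xs[0]]
    # Pick the cut with the fewest training errors for the rule "x >= cut -> 1".
    best_cut = min(cuts, key=lambda t: sum((x >= t) != bool(y) for x, y in rows))
    return lambda x: int(x >= best_cut)


def build_committee(majority, minority, mi_to_ma=1.0, train=train_stump, seed=0):
    """Train one member per majority chunk; every member sees all minority rows."""
    rng = random.Random(seed)
    shuffled = list(majority)
    rng.shuffle(shuffled)
    # Chunk size chosen so each subset has roughly the requested MI:MA ratio.
    chunk = max(1, round(len(minority) / mi_to_ma))
    return [
        train(shuffled[start:start + chunk] + list(minority))
        for start in range(0, len(shuffled), chunk)
    ]


def committee_predict(members, x, threshold=0.5):
    """Predict default (1) when the share of members voting 1 reaches threshold."""
    share = sum(member(x) for member in members) / len(members)
    return int(share >= threshold)


# Hypothetical data: 90 normal customers (label 0) and 10 defaults (label 1).
majority = [(x, 0) for x in range(90)]
minority = [(x, 1) for x in range(70, 120, 5)]
members = build_committee(majority, minority, mi_to_ma=1.0)
print(len(members))                          # 9 members, one per majority chunk
print(committee_predict(members, 200, 0.5))  # 1: every member votes default
print(committee_predict(members, -5, 0.5))   # 0: no member votes default
```

Raising `threshold` makes the committee more conservative about predicting default, while lowering `mi_to_ma` puts more majority cases in each subset and yields fewer members.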

    Chinese Abstract
    Abstract
    Acknowledgements
    Lists of Figures
    List of Tables
    Chapter 1 Introduction
        1.1 Research background and motivation
        1.2 Research Purpose
        1.3 Research process and framework
    Chapter 2 Literature Review
        2.1 Consumer loans
            2.1.1 Meaning of consumer loans
            2.1.2 Analysis of current situation about consumer loans
        2.2 Methods of credit risk evaluation
        2.3 Data mining
        2.4 Methods of solution to skewed data problem
        2.5 Relevant literature about consumer loans
        2.6 Summary
    Chapter 3 Modeling of credit valuation for consumer loan
        3.1 Research objective of this and data source
        3.2 Variables selection
        3.3 Results of sample analysis
        3.4 Modeling of Logistic Regression
        3.5 Modeling of discriminant analysis
            3.5.1 Prior tests of discriminant analysis
            3.5.2 Multivariate discriminant analysis model
        3.6 Modeling of Back-Propagation Neural Network
        3.7 Modeling of decision tree
        3.9 Summary
    Chapter 4 Using multi-classifier committee approach for skewed-data
        4.1 Multi-classifier committee approach for low confidence factor
            4.1.1 Selection of the ratio of MI to MA
            4.1.2 Developing a multi-classifier committee model
        4.2 Multi-classifier committee approach for high confidence factor
            4.2.1 Selection of the ratio of MI to MA
            4.2.2 Developing multi-classifier committee
        4.3 Summary
    Chapter 5 Conclusions
    Reference
    Appendix 1 Multi-classifier committee approach
    Appendix 2 Multi-classifier committee approach (Confidence factor=0.25)

