
Graduate student: Otgonkhishig Ganbayar
Thesis title: Predicting Credit Risk of Online Peer to Peer Lending by Applying Bagging and Random Forest Ensemble
Advisor: 洪西進 (Shi-Jinn Horng)
Oral examination committee: 吳怡樂 (Yi-Leh Wu), 金台齡 (Tai-lin Chin)
Degree: Master
Department: College of Electrical Engineering and Computer Science - Department of Computer Science and Information Engineering
Year of publication: 2018
Academic year of graduation: 106 (ROC calendar)
Language: English
Number of pages: 31
Keywords: credit scoring, bagging, random forest ensemble, p2p lending, entropy based feature selection

In this thesis, we analyze the credit risk of online peer-to-peer (P2P) lending, a platform on which individuals and businesses lend to or borrow from each other over the internet without going through a financial institution such as a bank. Although P2P lending offers borrowers and investors some advantages compared with bank deposits, it carries the risk that a loan will not be repaid. This research uses the Lending Club platform's publicly available historical loan dataset for 2015-2017. The raw data are cleaned with filtering methods and, because the initial dataset is imbalanced, resampled for training. We propose Bagging and Random Forest ensemble machine learning algorithms to classify loan status as good or bad, together with an entropy-based feature selection method in the preprocessing stage to explore, analyze, and determine the factors that play a crucial role in predicting credit risk. The algorithms are optimized to distinguish potentially good loans while identifying defaults, or bad loans. Other machine learning algorithms are also applied for comparison with the proposed method. The experimental results show that the proposed method can effectively raise the prediction accuracy for default risk.
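The entropy-based feature selection mentioned in the abstract can be illustrated with a minimal sketch that ranks categorical features by information gain. This is not the thesis's implementation: the helper names (`entropy`, `information_gain`, `rank_features`), the field names, and the four toy records are all assumptions made up for illustration, and the thesis may define or apply its selection criterion differently.

```python
import math
from collections import Counter, defaultdict


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of class labels."""
    total = len(labels)
    counts = Counter(labels)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def information_gain(feature_values, labels):
    """Reduction in label entropy after splitting on a categorical feature."""
    groups = defaultdict(list)
    for value, label in zip(feature_values, labels):
        groups[value].append(label)
    total = len(labels)
    # Weighted average entropy of the labels within each feature value.
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


def rank_features(rows, labels):
    """Return (feature name, information gain) pairs, highest gain first."""
    gains = {
        name: information_gain([row[name] for row in rows], labels)
        for name in rows[0].keys()
    }
    return sorted(gains.items(), key=lambda item: item[1], reverse=True)


# Toy, made-up loan records (hypothetical fields, not Lending Club data).
rows = [
    {"grade": "A", "home": "own"},
    {"grade": "A", "home": "rent"},
    {"grade": "B", "home": "own"},
    {"grade": "B", "home": "rent"},
]
labels = ["good", "good", "bad", "bad"]

ranking = rank_features(rows, labels)
for name, gain in ranking:
    print(f"{name}: {gain:.3f} bits")
```

In this toy data the hypothetical `grade` field separates good from bad loans completely, so it ranks above `home`, which carries no information about the label. A selection step of this kind would keep the top-ranked features before training the Bagging and Random Forest classifiers.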



TABLE OF CONTENTS
ABSTRACT
ACKNOWLEDGEMENTS
TABLE OF CONTENTS
CHAPTER I INTRODUCTION
I.1. Background and Motivation
I.2. Research Objectives
I.3. Research Structure
CHAPTER II LITERATURE REVIEW
II.1. Credit Scoring on Peer-to-Peer Lending
II.2. Linear Classification Techniques
CHAPTER III DATA DESCRIPTION
III.1. Dataset Description
III.2. Data Preprocessing and Cleaning
CHAPTER IV RESEARCH METHODOLOGY
IV.1. Entropy Based Feature Selection
IV.2. Bagging and Random Forest Ensemble
CHAPTER V EXPERIMENTAL RESULT
V.1. Experimental Results
V.2. Discussion
CHAPTER VI CONCLUSION
VI.1. Conclusion
VI.2. Future Work
REFERENCES

