
Graduate student: Chih-Hsuan Huang (黃植暄)
Thesis title: Artificial Bee Colony-based Support Vector Machines with Feature Selection and Parameters Optimization for Rule Extraction
Thesis title (Chinese): 應用蜂群演算法為基礎之屬性選擇及參數最佳化之支撐向量機於規則萃取
Advisor: Ren-Jieh Kuo (郭人介)
Oral examination committee: Vincent F. Yu (喻奉天), Shi-Woei Lin
Degree: Master's
Department: College of Management - Department of Industrial Management
Publication year: 2014
Academic year of graduation: 102
Language: English
Number of pages: 108
Keywords (Chinese): 支撐向量機, 蜂群演算法, 決策樹, 分類, 規則萃取, 屬性選擇, 參數最佳化
Keywords (English): Support vector machines, Artificial bee colony, Decision tree, Classification, Rule extraction, Feature selection, Parameters optimization

Classification is an important problem in data mining. Support vector machines (SVMs) are among the state-of-the-art methods for classification and generally achieve good accuracy because they can model nonlinear relationships. This strength is also their main weakness: like other methods that produce nonlinear models, SVMs are regarded as incomprehensible black-box classifiers. This lack of explanatory power is a serious problem in applications that require comprehensibility, such as medical diagnosis. To address it, previous research proposed the Active Learning-Based Approach (ALBA), which uses a decision tree (DT) classifier to extract comprehensible rules from SVMs. ALBA does extract comprehensible rules, and it improves classification accuracy over a decision tree used alone. However, the final decision trees that ALBA generates are generally very complex. This study therefore proposes an artificial bee colony-SVM-decision tree (ABC-SVM-DT) algorithm for extracting comprehensible rules from SVMs, in which the artificial bee colony (ABC) algorithm performs feature selection and parameter optimization before the data enter SVM-DT. Six datasets from the UCI Machine Learning Repository are used to evaluate the proposed algorithm. The results show that ABC-SVM-DT simultaneously improves classification accuracy and reduces the complexity of the final decision tree.
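The abstract does not spell out how the ABC step searches over features and parameters, so the following is only a minimal sketch of a basic artificial bee colony loop (employed, onlooker, and scout phases) in plain Python. The solution encoding (log2 of C, log2 of gamma, then one score per feature thresholded at 0.5), the bounds, the colony settings, and the function names `abc_maximize`, `decode`, and `toy_fitness` are all assumptions made for illustration, not the thesis's actual design. The toy fitness is a stand-in; in a setting like the one described, it would presumably be replaced by a measure such as the validation accuracy of the SVM-DT pipeline.

```python
import random


def abc_maximize(fitness, bounds, n_sources=10, limit=20, n_cycles=100, seed=0):
    """Maximize `fitness` over the box `bounds` with a basic ABC loop."""
    rng = random.Random(seed)
    dim = len(bounds)

    def random_source():
        return [rng.uniform(lo, hi) for lo, hi in bounds]

    def neighbor(i):
        # Perturb one random dimension of source i relative to another source k.
        k = rng.choice([s for s in range(n_sources) if s != i])
        j = rng.randrange(dim)
        cand = sources[i][:]
        cand[j] += rng.uniform(-1.0, 1.0) * (sources[i][j] - sources[k][j])
        lo, hi = bounds[j]
        cand[j] = min(max(cand[j], lo), hi)  # clip back into the search box
        return cand

    def try_improve(i):
        # Greedy selection: keep the candidate only if it is strictly better.
        cand = neighbor(i)
        value = fitness(cand)
        if value > values[i]:
            sources[i], values[i], trials[i] = cand, value, 0
        else:
            trials[i] += 1

    sources = [random_source() for _ in range(n_sources)]
    values = [fitness(s) for s in sources]
    trials = [0] * n_sources
    best_value = max(values)
    best = sources[values.index(best_value)][:]

    for _ in range(n_cycles):
        # Employed bees: one local search step per food source.
        for i in range(n_sources):
            try_improve(i)
        # Onlooker bees: revisit sources with probability proportional
        # to their (shifted, strictly positive) fitness.
        low = min(values)
        weights = [v - low + 1e-9 for v in values]
        for i in rng.choices(range(n_sources), weights=weights, k=n_sources):
            try_improve(i)
        # Scout bees: abandon sources that failed to improve too many times.
        for i in range(n_sources):
            if trials[i] > limit:
                sources[i] = random_source()
                values[i] = fitness(sources[i])
                trials[i] = 0
        # Memorize the best solution, since scouts may overwrite it.
        top = max(values)
        if top > best_value:
            best_value = top
            best = sources[values.index(top)][:]
    return best, best_value


def decode(solution, n_features):
    """Hypothetical encoding: [log2 C, log2 gamma, one score per feature]."""
    c = 2.0 ** solution[0]
    gamma = 2.0 ** solution[1]
    mask = [score > 0.5 for score in solution[2:2 + n_features]]
    return c, gamma, mask


def toy_fitness(solution):
    # Stand-in objective (assumption): rewards a particular feature mask and
    # parameters near log2 C = 3, log2 gamma = -2. Its maximum value is 4.
    _, _, mask = decode(solution, 4)
    target_mask = [True, False, True, False]
    agreement = sum(m == t for m, t in zip(mask, target_mask))
    return agreement - abs(solution[0] - 3.0) - abs(solution[1] + 2.0)


bounds = [(-5.0, 15.0), (-15.0, 3.0)] + [(0.0, 1.0)] * 4
best, best_value = abc_maximize(toy_fitness, bounds)
c, gamma, mask = decode(best, 4)
```

Encoding the SVM parameters and the feature mask in a single vector lets one search loop handle feature selection and parameter optimization together, which matches the role the abstract assigns to ABC; the specific encoding here is only one plausible choice.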

Abstract (Chinese)
ABSTRACT
Acknowledgements
CONTENTS
LIST OF TABLES
LIST OF FIGURES
CHAPTER 1 INTRODUCTION
  1.1 Background
  1.2 Research Objectives
  1.3 Research Scope
  1.4 Thesis Organization
CHAPTER 2 LITERATURE REVIEW
  2.1 The Support Vector Machines
    2.1.1 The Linear SVM (separable data)
    2.1.2 The Linear Generalized SVM (nonseparable data)
    2.1.3 Non-Linear SVM
  2.2 Rule Extraction from SVMs
    2.2.1 Utilizing SVM as a Black Box
    2.2.2 Utilizing SVs in SVM only
    2.2.3 Utilizing SVs and the Separating Hyper-plane in SVM
    2.2.4 Utilizing SVs, Training Data and Separating Hyper-plane
  2.3 The Support Vector Machine with Metaheuristics
    2.3.1 SVM with Genetic Algorithm
    2.3.2 Particle Swarm Optimization with SVM
    2.3.3 Artificial Bee Colony with SVM
  2.4 Decision Tree
CHAPTER 3 METHODOLOGY
  3.1 Preprocessing and Defining Phase
  3.2 Artificial Bee Colony Phase
  3.3 Support Vector Machine-Decision Tree Phase
CHAPTER 4 NUMERICAL ILLUSTRATION
  4.1 Datasets
  4.2 Evaluation Criterion
  4.3 Experiment Setup
    4.3.1 Parameter Determination
  4.4 Experiment result
    4.4.1 Statistic Test
    4.4.2 Comparison between Original DT and SVM-DT
    4.4.3 Comparison among SVM-DT and the Feature Selection and Parameters Optimization based SVM-DT (PSO-SVM-DT, GA-SVM-DT and ABC-SVM-DT)
CHAPTER 5 CONCLUSIONS AND FUTURE RESEARCH
  5.1 Conclusions
  5.2 Contributions
  5.3 Future Research
REFERENCES
APPENDIX
  Appendix I - Experiment results of full factorial design for ABC-SVM-DT.
  Appendix II - Experiment results of Taguchi method (citation) for GA-SVM-DT.
  Appendix III - Experiment results of Taguchi method (citation) for PSO-SVM-DT.
  Appendix IV - Experiment results of Iris dataset using ABC-SVM-DT.
  Appendix V - Experiment results of Wine dataset using ABC-SVM-DT.
  Appendix VI - Experiment results of Ionosphere dataset using ABC-SVM-DT.
  Appendix VII - Experiment results of Sonar dataset using ABC-SVM-DT.
  Appendix VIII - Experiment results of Cmc dataset using ABC-SVM-DT.
  Appendix IX - Experiment results of German dataset using ABC-SVM-DT.
  Appendix X - Experiment results of Iris dataset using GA-SVM-DT.
  Appendix XI - Experiment results of Wine dataset using GA-SVM-DT.
  Appendix XII - Experiment results of Ionosphere dataset using GA-SVM-DT.
  Appendix XIII - Experiment results of Sonar dataset using GA-SVM-DT.
  Appendix XIV - Experiment results of Cmc dataset using GA-SVM-DT.
  Appendix XV - Experiment results of German dataset using GA-SVM-DT.
  Appendix XVI - Experiment results of Iris dataset using PSO-SVM-DT.
  Appendix XVII - Experiment results of Wine dataset using PSO-SVM-DT.
  Appendix XVIII - Experiment results of Ionosphere dataset using PSO-SVM-DT.
  Appendix XIX - Experiment results of Sonar dataset using PSO-SVM-DT.
  Appendix XX - Experiment results of Cmc dataset using PSO-SVM-DT.
  Appendix XXI - Experiment results of German dataset using PSO-SVM-DT.
  Appendix XXII - Experiment results of Iris dataset using SVM-DT.
  Appendix XXIII - Experiment results of Wine dataset using SVM-DT.
  Appendix XXIV - Experiment results of Ionosphere dataset using SVM-DT.
  Appendix XXV - Experiment results of Sonar dataset using SVM-DT.
  Appendix XXVI - Experiment results of Cmc dataset using SVM-DT.
  Appendix XXVII - Experiment results of German dataset using SVM-DT.
  Appendix XXVIII - Experiment results of Iris dataset using DT.
  Appendix XXIX - Experiment results of Wine dataset using DT.
  Appendix XXX - Experiment results of Ionosphere dataset using DT.
  Appendix XXXI - Experiment results of Sonar dataset using DT.
  Appendix XXXII - Experiment results of Cmc dataset using DT.
  Appendix XXXIII - Experiment results of German dataset using DT.

