
Graduate Student: Po-Han Huang (黃柏翰)
Thesis Title: A Distributed Ensemble Scheme for Nonlinear Support Vector Machine (分佈式集成方案非線性支持向量機)
Advisor: Yuh-Jye Lee (李育杰)
Committee Members: Hsing-Kuo Pao (鮑興國), Chuan-Kai Yang (楊傳凱), Yi-Ren Yeh (葉倚任)
Degree: Master
Department: Department of Computer Science and Information Engineering, College of Electrical Engineering and Computer Science
Year of Publication: 2015
Academic Year of Graduation: 103 (ROC calendar, 2014-2015)
Language: English
Number of Pages: 32
Keywords: Ensemble learning, parallel algorithm, SVM


    We propose an ensemble scheme with a parallel computational structure, called Distributed Ensemble Support Vector Machine (DESVM), to overcome the practical difficulties of large-scale nonlinear Support Vector Machines (SVMs). The dataset is split into many stratified partitions so that each computing unit only has to build a smaller nonlinear SVM. However, each partition might still be too large for a conventional SVM solver, so we apply the reduced kernel trick to generate a nonlinear SVM classifier for each partition. The nonlinear SVM for one partition can be treated as an approximation model based on that part of the dataset. We then use a linear SVM classifier to fuse the nonlinear SVM classifiers generated from all data partitions. In this linear SVM training model, each nonlinear SVM classifier acts as an attribute generator, or local expert. DESVM thus produces a fusion model that is a weighted combination of the nonlinear SVM classifiers; it can be interpreted as a weighted voting decision made by a group of local experts. We test the proposed method on five benchmark datasets. The numerical results show that DESVM is competitive in accuracy and achieves a high speed-up. DESVM can therefore be a powerful tool for binary classification on large-scale, non-linearly separable datasets.
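
    The pipeline described above (partition the data, build a reduced-kernel nonlinear SVM per partition, then fuse the experts' decision values with a linear SVM) can be illustrated with a minimal single-machine sketch in Python. It assumes NumPy and scikit-learn; the thesis's actual implementation is MPI-parallel and uses the RSVM/SSVM solvers, so the solver choice (LinearSVC), the partition count, the reduced-set size, and the kernel parameter below are illustrative assumptions, and stratified splitting is replaced by a random permutation for brevity.

        # A minimal sketch of the DESVM idea, assuming NumPy and scikit-learn.
        # Parameter values and helper names are illustrative, not the thesis's
        # actual MPI-based RSVM implementation.
        import numpy as np
        from sklearn.svm import LinearSVC
        from sklearn.metrics.pairwise import rbf_kernel

        def train_desvm(X, y, n_partitions=8, reduced_size=200, gamma=0.1, seed=0):
            rng = np.random.default_rng(seed)
            parts = np.array_split(rng.permutation(len(X)), n_partitions)
            local_experts = []
            for p in parts:
                Xp, yp = X[p], y[p]
                # Reduced kernel trick: a small random subset of the partition
                # serves as the kernel columns; a linear SVM is then fitted in
                # this reduced kernel feature space, approximating a nonlinear SVM.
                m = min(reduced_size, len(Xp))
                ref = Xp[rng.choice(len(Xp), size=m, replace=False)]
                clf = LinearSVC().fit(rbf_kernel(Xp, ref, gamma=gamma), yp)
                local_experts.append((ref, clf))
            # Ensemble phase: each expert's decision value on the full dataset
            # becomes one attribute; a linear SVM learns the fusion weights.
            Z = np.column_stack([clf.decision_function(rbf_kernel(X, ref, gamma=gamma))
                                 for ref, clf in local_experts])
            fusion = LinearSVC().fit(Z, y)
            return local_experts, fusion

        def predict_desvm(local_experts, fusion, X, gamma=0.1):
            # Use the same gamma as in training. The fusion model acts as a
            # weighted vote over the local experts' decision values.
            Z = np.column_stack([clf.decision_function(rbf_kernel(X, ref, gamma=gamma))
                                 for ref, clf in local_experts])
            return fusion.predict(Z)

    Training each partition's expert is independent of the others, which is where the reported speed-up would come from: in the MPI version described in the abstract, each computing unit builds one smaller nonlinear SVM, corresponding to one iteration of the loop body above.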

    1 Introduction
    2 Related Work
      2.1 Reduced Support Vector Machine
      2.2 Ensemble Learning
        2.2.1 Adaboost
        2.2.2 Stacking
    3 Distributed Ensemble Support Vector Machine
      3.1 Training
        3.1.1 Distributed phase: Build local expert for data subset
        3.1.2 Ensemble phase: Learning from local experts
      3.2 Predicting
      3.3 Implementation details
        3.3.1 Message Passing Interface Parallel Structure
        3.3.2 Training
        3.3.3 Testing
    4 Experiments
      4.1 Preprocessing
      4.2 Experiment Setting
      4.3 Experiment Results
        4.3.1 Parameter Setting
        4.3.2 Divide-and-Conquer Support Vector Machine
        4.3.3 RSVM models vs. Fusion model
        4.3.4 Computation time
        4.3.5 Accuracy
        4.3.6 Training Time vs. number of cores
    5 Conclusion and Future Work
      5.1 Conclusion
      5.2 Future Work

