Basic Search / Detailed Display

Author: 黃建銘
Chien-Ming Huang
Thesis Title: 支撐向量機的自動參數選擇
Automatic Model Selection for Support Vector Machines
Advisor: 李育杰
Yuh-Jye Lee
Committee: 林共進
Dennis K.J. Lin
林智仁
Chih-Jen Lin
項天瑞
Tien-Ruey Hsiang
鮑興國
Hsing-Kuo Kenneth Pao
Degree: 碩士
Master
Department: 電資學院 - 資訊工程系
Department of Computer Science and Information Engineering
Thesis Publication Year: 2005
Graduation Academic Year: 93
Language: 英文
Pages: 58
Keywords (in Chinese): 縮減集支撐向量機平滑支撐向量機模型選擇實驗設計ε-不敏感平滑支撐向量迴歸分析支撐向量機均勻設計
Keywords (in other languages): epsilon-insensitive support vector regression, model selection, reduced support vector machine, design of experiments, smooth support vector machine, support vector machine, uniform design
Reference times: Clicks: 330Downloads: 4
Share:
School Collection Retrieve National Library Collection Retrieve Error Report
  • 支撐向量機(support vector machines)在近年來成為最熱門的一種機器學習演算法。它被成功地應用在分類問題(classification problems)及ε-不敏感迴歸問題(ε-insensitive regression problems)。然而,支撐向量機的結果模型(resulting model)影響對於未知資料的預測能力。所以支撐向量機的模型選擇(model selection)是一個很重要的問題。
    現今常用的格子點(grid)法雖然十分的簡單,但是此方法非常的花費時間,所以很多的研究學者都致力於提出一個較快速的模型選擇方法。然而這些方法大部分都無法滿足使用者的需求。因此在這篇論文中,我們提出新的模型選擇方法。這個模型選擇的方法分別架構在實驗設計(design of experiments)及巢狀均勻設計(nested uniform design)。並且我們也提出了一個更有效率的參數搜尋範圍決定的演算法。結合這些方法,使用者可以很容易並且快速地得到良好預測能力的支撐向量機結果模型。
    在實驗中,我們不僅僅只把我們提出的模型選擇法應用在傳統的支撐向量機上,也應用在平滑支撐向量機(smooth support vector machine)、ε-不敏感平滑支撐向量迴歸分析(ε-insensitive smooth support vector regression)及縮減集支撐向量機(reduced support vector machine)上。實驗數據顯示我們的模型選擇方法在這些方法中不但效率很好,而且結果模型的預測能力也表現得相當好。


    Support vector machine learning algorithms (SVMs) recently have become the most recommendable
    learning algorithms because of their high generalization ability and sound
    theoretical based on the statistical learning theory. They are successfully applied to classification
    problems and epsilon-insensitive regression problems. A major open problem in SVMs
    is model selection, which tunes the kernel parameters and the control variable C.
    Model selection is usually done by minimizing an estimate of generalization error.
    The most common used grid method with k-fold cross-validation is obviously very clear
    and simple, but it also has an apparent shortcoming of time-consuming. The grid method
    along with estimators will take a lot of time in model selection, therefore many improved
    model selection methods have been proposed. However, those methods still have the
    difficulty to meet user’s requirements. Thus, we propose new model selection methods,
    which are based on the design of experiments (DOE) and the nested uniform design (UD)
    with an effective search range determination method. By using the proposed methods,
    the model selection for SVMs is easy and automatic to users.
    We perform experiments on several real world data sets and apply our model selection
    methods to not only conventional SVMs but also SSVM, epsilon-SSVR and RSVM successfully.
    The numerical results show that our proposed methods have higher efficiency
    while maintaining the performance of the SVMs model.

    1 Introduction 1 2 Support Vector Machine Learning Algorithms 4 2.1 Support Vector Machines . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4 2.1.1 Support Vector Classification . . . . . . . . . . . . . . . . . . . . . 5 2.1.2 ǫ-insensitive Support Vector Regression . . . . . . . . . . . . . . . . 11 2.2 Smooth Support Vector Machine . . . . . . . . . . . . . . . . . . . . . . . 13 2.3 ǫ-insensitive Smooth Support Vector Regression . . . . . . . . . . . . . . . 15 2.4 Reduced Support Vector Machine . . . . . . . . . . . . . . . . . . . . . . . 17 3 Automatic Model Selection Algorithms 19 3.1 Motivation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19 3.2 The Concept of Model Selection . . . . . . . . . . . . . . . . . . . . . . . . 19 3.3 The Search Range Determination . . . . . . . . . . . . . . . . . . . . . . . 23 3.4 The Design of Experiments Based Model Selection Method . . . . . . . . . 25 3.5 The Nested Uniform Design Based Model Selection Method . . . . . . . . 27 4 Experiments and Results 30 4.1 Numerical Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31 5 Conclusion and Further Work 39 5.1 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 39 5.2 Further Work . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 40

    [1] A. Ben-Israel, A. Ben-Tal, and S. Zlobec. Optimality in Nonlinear Programming: A
    Feasible Directions Approach. John Wiley & Sons, New York, 1981.
    [2] D. P. Bertsekas. Nonlinear Programming. Athena Scientific, Belmont, MA, second
    edition, 1999.
    [3] P. S. Bradley and O. L. Mangasarian. Massive Data Discrimination via Linear
    Support Vector Machines. Optimization Methods and Software, 13:1–10, 2000.
    ftp://ftp.cs.wisc.edu/math-prog/tech-reports/98-03.ps.
    [4] P. S. Bradley, O. L. Mangasarian, and D. R. Musicant. Optimization Methods
    in Massive Datasets. Technical Report 99-01, Data Mining Institute, Computer
    Sciences Department, University of Wisconsin, Madison, Wisconsin, June 1999.
    ftp://ftp.cs.wisc.edu/pub/dmi/tech-reports/99-01.ps.
    [5] C. J. C. Burges. A Tutorial on Support Vector Machines for Pattern Recognition.
    Data Mining and Knowledge Discovery, 2(2):121–167, 1998.
    [6] B Caputo, K Sim, F Furesjo, and A Smola. Appearance-based Object Recognition
    using SVMs: Which Kernel Should I Use? In Proc of NIPS workshop on Statistical
    methods for computational experiments in visual processing and computer vision,
    Whistler, 2002, 2002.
    [7] Chin-Chung Chang and Chih-Jen Lin. LIBSVM: A Library for Support Vector Ma-
    chines, April 2005.
    [8] Olivier Chapelle, Vladimir Vapnik, Olivier Bousquet, and Sayan Mukherjee. Ma-
    chine Learning. Springer Science+Business Media B.V., Formerly Kluwer Academic
    Publishers B.V. Formerly Kluwer Academic Publishers B.V., 2002.
    [9] C. Chen and O. L. Mangasarian. Smoothing Methods for Convex Inequalities and
    Linear Complementarity Problems. Mathematical Programming, 71(1):51–69, 1995.
    [10] C. Chen and O. L. Mangasarian. A Class of Smoothing Functions for Nonlinear and
    Mixed Complementarity Problems. Computational Optimization and Applications,
    5(2):97–138, 1996.
    [11] V. Cherkassky and F. Mulier. Learning from Data - Concepts, Theory and Methods.
    John Wiley & Sons, New York, 1998.
    [12] R. Courant and D. Hilbert. Methods of Mathematical Physics. Interscience Publishers,
    New York, 1953.
    [13] T. M. Cover and P. E. Hart. Nearest Neighbor Pattern Classification. IEEE Trans-
    action on Information Theory, 13:21–27, 1967.
    [14] N. Cristianini and J. Shawe-Taylor. An Introduction to Support Vector Machines.
    Cambridge University Press, Cambridge, 2000.
    [15] K.T. Fang, D.K.J. Lin, P. Winker, and Y. Zhang. Uniform Design: Theory and
    Applications, 2000.
    [16] R.A. Fisher. The Design of Experiments. Edinburgh and London and Oliver and
    Boyd, 1935.
    [17] S. R. Gunn. Support Vector Machines for Classification and Regression. Technical
    report, Image Speech and Intelligent Systems Research Group, University of
    Southampton, 1997.
    [18] S. Hettich, C.L. Blake, and C.J. Merz. UCI Repository of Machine Learning
    Databases, 1998.
    [19] C.-W. Hsu and C.-J. Lin. A Simple Decomposition Method for Support Vector
    Machines. Machine Learning, 46:291–314, 2002.
    [20] S.-Y. Huang and Y.-J. Lee. Reduced Support Vector Machines: A Statistical Theory.
    Technical report, Institute of Statistical Science, Academia Sinica, 2004.
    [21] T. Joachims. Learning to Classify Text Using Support Vector Machines. Kluwer
    Academic Publishers, Norwell, Massachusetts, 2002.
    [22] T. Joachims. SVMlight, 2002. http://svmlight.joachims.org.
    [23] Thorsten Joachims. Making Large-Scale Support Vector Machine Learning Practical,
    pages 169–184. MIT Press, Cambridge, MA, USA, 1999.
    [24] Ron Kahavi. A Study of Cross-Validation and Bootstrap for Accuracy Estimation
    and Model Selection. In International Joint Conference on Artificial Intelligence,
    1995.
    [25] S. Sathiya Keerthi. Efficient Tuning of SVM Hyperparameters Using Radius/Margin
    Bound and Iterative Algorithms. IEEE TRANSACTIONS ON NEURAL NET-
    WORKS, 13:1225–1229, 2002.
    [26] S. Sathiya Keerthi and Chih-Jen Lin. Asymptotic Behaviors of Support Vector Machines
    with Gaussian Kernel. Neural Comput., 15(7):1667–1689, 2003.
    [27] Y.-J. Lee and O. L. Mangasarian. RSVM: Reduced Support Vector Machines. Technical
    Report 00-07, Data Mining Institute, Computer Sciences Department, University
    of Wisconsin, Madison, Wisconsin, July 2000. Proceedings of the First SIAM International
    Conference on Data Mining, Chicago, April 5-7, 2001, CD-ROM Proceedings.
    ftp://ftp.cs.wisc.edu/pub/dmi/tech-reports/00-07.ps.
    [28] Y.-J. Lee and O. L. Mangasarian. SSVM: A Smooth Support Vector Machine. Com-
    putational Optimization and Applications, 20:5–22, 2001. Data Mining Institute, University
    of Wisconsin, Technical Report 99-03. ftp://ftp.cs.wisc.edu/pub/dmi/techreports/
    99-03.ps.
    [29] Yuh-Jye Lee, Wen-Feng Hsieh, and Chien-Ming Huang. ǫ-SSVR: A Smooth Support
    Vector Machine for ǫ-insensitive Regression. IEEE Transactions on Knowledge and
    Data Engineering, 17(5):678–685, 2005.
    [30] David D. Lewis. Naive (Bayes) at Forty: The Independence Assumption in Infor-
    mation Retrieval. AT&T Labs – Research, 180 Park Avenue Florham Park, NJ
    07932-0971 USA.
    [31] H.-T. Lin and C.-J. Lin. A Study on Sigmoid Kernels for SVM and the Training of
    Non-PSD Kernels by SMO-type Methods. Technical report, Department of Computer
    Science and Information Engineering, National Taiwan University, 2003.
    [32] O. L. Mangasarian. Mathematical Programming in Neural Networks. ORSA Journal
    on Computing, 5(4):349–360, 1993.
    [33] O. L. Mangasarian. Nonlinear Programming. SIAM, Philadelphia, PA, 1994.
    [34] O. L. Mangasarian. Generalized Support Vector Machines. In A. Smola, P. Bartlett,
    B. Sch¨olkopf, and D. Schuurmans, editors, Advances in Large Margin Classifiers,
    pages 135–146, Cambridge, MA, 2000. MIT Press. ftp://ftp.cs.wisc.edu/mathprog/
    tech-reports/98-14.ps.
    [35] O. L. Mangasarian and D. R. Musicant. Successive Overrelaxation for Support
    Vector Machines. IEEE Transactions on Neural Networks, 10:1032–1037, 1999.
    ftp://ftp.cs.wisc.edu/math-prog/tech-reports/98-18.ps.
    [36] MATLAB. User’s Guide. The MathWorks, Inc., Natick, MA 01760, 1994-2001.
    http://www.mathworks.com.
    [37] T. M. Mitchell. Machine Learning. McGraw-Hill, Boston, 1997.
    [38] D. R. Musicant and A. Feinberg. Active Set Support Vector Regression. IEEE
    Transactions on Neural Networks, 15:268–275, 2004.
    [39] J. Nocedal and S. J. Wright. Numerical Optimization. Springer-Verlag, New York,
    1999.
    [40] Mark J. L. Orr. Recent Advances in Radial Basis Function Networks. Technical
    report, Institute for Adaptive and Neural Computation, June 1999.
    [41] E. Osuna, R. Freund, and F. Girosi. Improved Training Algorithm for Support Vector
    Machines. In Proceedings of IEEE NNSP’97, Amelia Island, FL, September 1997,
    276-285, New York, 1997. IEEE Press. http://www.ai.mit.edu/people/girosi/homepage/
    svm.html.
    [42] Edgar Osuna, Federico Girosi, and Robert Freund. Training Support Vector Machines:
    An Application to Face Detection, 1997.
    [43] M. L. Overton and R.S. Womersley. Optimality Conditions and Duality Theory for
    Minimizing Sums of the Largest Eigenvalues of Symmetric Matrices. Mathematical
    Programming, 62:321–357, 1993.
    [44] J. C. Platt. Sequential Minimal Optimization: A Fast Algorithm for Training Support
    Vector Machines. In B. Sch¨olkopf, C. J. C. Burges, and A. J. Smola, editors,
    Advances in Kernel Methods - Support Vector Learning, pages 185–208. MIT Press,
    1999. http://www.research.microsoft.com/∼jplatt/smo.html.
    [45] John C. Platt. Fast Training of Support Vector Machines Using Sequential Minimal
    Optimization. MIT Press, Cambridge, MA, USA, 1999.
    [46] C.E. Rasmussen. DELVE-Data for Evaluating Learning in Valid Experiments.
    http://www.cs.toronto.edu/∼delve/data/comp-activ/desc.html.
    [47] C.E. Rasmussen. DELVE-Data for Evaluating Learning in Valid Experiments.
    http://www.cs.toronto.edu/∼delve/data/kin/desc.html.
    [48] G. R¨atsch. Benchmark Data Sets, 1999.
    [49] Carl Staelin. Parameter Selection for Support Vector Machines. Technical report,
    HP Laboratories Israel, November 2003.
    [50] M. Stone. Cross-validatory Choice and Assessment of Statistical Predictions. Journal
    of the Royal Statistical Society, 36:111–147, 1974.
    [51] V. N. Vapnik. The Nature of Statistical Learning Theory. Springer, New York, 1995.
    [52] J. J. Ward, L. J. McGuffin, B. F. Buxton, and D. T. Jones. Secondary Structure
    Prediction with Support Vector Machine. Bioinformatics, 19:1650–1655, 2003.
    [53] I. H. Witten and E. Frank. Data Mining: Practical Machine Learning Tools and
    Techniques with Java Implementations. Morgan Kaufmann, San Francisco, CA, 1999.

    QR CODE