
Author: Chien-Ming Huang (黃建銘)
Thesis Title: Automatic Model Selection for Support Vector Machines (支撐向量機的自動參數選擇)
Advisor: Yuh-Jye Lee (李育杰)
Committee: Dennis K.J. Lin (林共進), Chih-Jen Lin (林智仁), Tien-Ruey Hsiang (項天瑞), Hsing-Kuo Kenneth Pao (鮑興國)
Degree: Master (碩士)
Department: Department of Computer Science and Information Engineering, College of Electrical Engineering and Computer Science
Thesis Publication Year: 2005
Graduation Academic Year: 93
Language: English
Pages: 58
Keywords (in Chinese): 縮減集支撐向量機、平滑支撐向量機、模型選擇、實驗設計、ε-不敏感平滑支撐向量迴歸分析、支撐向量機、均勻設計
Keywords (in other languages): epsilon-insensitive support vector regression, model selection, reduced support vector machine, design of experiments, smooth support vector machine, support vector machine, uniform design

Support vector machines (SVMs) have become one of the most popular machine learning algorithms in recent years. They have been applied successfully to classification problems and ε-insensitive regression problems. However, the resulting model of an SVM determines its predictive ability on unseen data, so model selection for SVMs is an important problem.
The commonly used grid method is simple but very time-consuming, so many researchers have worked on faster model selection methods; most of these, however, fail to meet users' needs. In this thesis we therefore propose new model selection methods, built respectively on the design of experiments and on the nested uniform design, together with a more efficient algorithm for determining the parameter search range. Combining these methods, a user can easily and quickly obtain an SVM model with good predictive ability.
In our experiments we apply the proposed model selection methods not only to conventional SVMs but also to the smooth support vector machine, ε-insensitive smooth support vector regression, and the reduced support vector machine. The numerical results show that our model selection methods are efficient on these machines and that the resulting models also predict well.


Support vector machines (SVMs) have recently become one of the most widely recommended learning algorithms because of their high generalization ability and their sound theoretical foundation in statistical learning theory. They have been applied successfully to classification problems and ε-insensitive regression problems. A major open problem in SVMs is model selection, which tunes the kernel parameters and the regularization parameter C. Model selection is usually done by minimizing an estimate of the generalization error. The most commonly used approach, the grid method with k-fold cross-validation, is clear and simple, but it has an obvious shortcoming: it is very time-consuming. Because the grid method combined with such error estimators takes so long, many improved model selection methods have been proposed; however, these methods still have difficulty meeting users' requirements. We therefore propose new model selection methods based on the design of experiments (DOE) and on the nested uniform design (UD), combined with an effective method for determining the search range. With the proposed methods, model selection for SVMs becomes easy and automatic for users.
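The baseline grid method with k-fold cross-validation can be sketched in a few lines. This is a minimal illustration, not the thesis implementation: it assumes a kernel regularized least-squares classifier as a cheap stand-in for an SVM solver, and the helper names (`cv_accuracy`, `grid_search`) and toy parameter ranges are made up for the sketch.

```python
import numpy as np

def rbf_kernel(A, B, gamma):
    # K[i, j] = exp(-gamma * ||A_i - B_j||^2)
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

def cv_accuracy(X, y, C, gamma, k=5, seed=0):
    # k-fold cross-validation accuracy of a kernel regularized
    # least-squares classifier (an SVM stand-in for illustration)
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(X)), k)
    accs = []
    for i in range(k):
        te = folds[i]
        tr = np.concatenate(folds[:i] + folds[i + 1:])
        K = rbf_kernel(X[tr], X[tr], gamma)
        # solve (K + I/C) alpha = y on the training folds
        alpha = np.linalg.solve(K + np.eye(len(tr)) / C, y[tr])
        pred = np.sign(rbf_kernel(X[te], X[tr], gamma) @ alpha)
        accs.append((pred == y[te]).mean())
    return float(np.mean(accs))

def grid_search(X, y, log2C_range, log2g_range):
    # exhaustive grid method: evaluate every (C, gamma) pair
    best = (-1.0, None, None)
    for lc in log2C_range:
        for lg in log2g_range:
            acc = cv_accuracy(X, y, 2.0 ** lc, 2.0 ** lg)
            if acc > best[0]:
                best = (acc, 2.0 ** lc, 2.0 ** lg)
    return best
```

Every (C, gamma) pair on the grid costs a full k-fold evaluation, which is exactly the exhaustive cost the DOE- and UD-based methods aim to avoid.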
We perform experiments on several real-world data sets and successfully apply our model selection methods not only to conventional SVMs but also to SSVM, ε-SSVR, and RSVM. The numerical results show that our proposed methods are more efficient while maintaining the predictive performance of the SVM models.
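The nested uniform design strategy (evaluate a space-filling set of parameter points, then shrink the search box around the incumbent best and repeat) can be sketched as follows. The good-lattice-point construction here only approximates a true uniform design table, and the function names, generator, and shrink factor are assumptions for illustration, not the thesis's UD tables:

```python
import numpy as np

def glp_points(n, h):
    # good-lattice-point set on [0,1)^2 with generator (1, h),
    # a simple approximation to a 2-D uniform design of n runs
    i = np.arange(1, n + 1)
    return np.stack([(i % n) / n, ((i * h) % n) / n], axis=1)

def nested_ud_search(score, lo, hi, runs=13, h=5, stages=3, shrink=0.5):
    # score: callable on a point (log2 C, log2 gamma) -> larger is better
    # lo, hi: 2-vectors bounding the initial search box
    lo, hi = np.asarray(lo, float), np.asarray(hi, float)
    best_x, best_s = None, -np.inf
    for _ in range(stages):
        # scale the design points into the current box and evaluate
        pts = lo + glp_points(runs, h) * (hi - lo)
        for p in pts:
            s = score(p)
            if s > best_s:
                best_s, best_x = s, p
        # shrink the box around the incumbent best, clamped to the old box
        half = (hi - lo) * shrink / 2
        lo = np.maximum(lo, best_x - half)
        hi = np.minimum(hi, best_x + half)
    return best_x, best_s
```

Each stage spends only `runs` evaluations instead of a full grid, so the total cost is `runs * stages` model fits while the box keeps contracting toward a good parameter region.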

1 Introduction 1
2 Support Vector Machine Learning Algorithms 4
2.1 Support Vector Machines 4
2.1.1 Support Vector Classification 5
2.1.2 ε-insensitive Support Vector Regression 11
2.2 Smooth Support Vector Machine 13
2.3 ε-insensitive Smooth Support Vector Regression 15
2.4 Reduced Support Vector Machine 17
3 Automatic Model Selection Algorithms 19
3.1 Motivation 19
3.2 The Concept of Model Selection 19
3.3 The Search Range Determination 23
3.4 The Design of Experiments Based Model Selection Method 25
3.5 The Nested Uniform Design Based Model Selection Method 27
4 Experiments and Results 30
4.1 Numerical Results 31
5 Conclusion and Further Work 39
5.1 Conclusion 39
5.2 Further Work 40

