
Graduate student: 林己吉
Citra - Dwi Perkasa
Thesis title: A Study of Intrusion Detection System Using Support Vector Machines and Hierarchical Clustering
Advisor: 洪西進
Shi-Jinn Horng
Oral examination committee: 鍾國亮
Kuo-Liang Chung
梅興
Hsing Mei
王永鐘
Yung-Chung Wang
蘇民揚
Ming-Yang Su
Degree: Master
Department: College of Electrical Engineering and Computer Science, Department of Computer Science and Information Engineering
Year of publication: 2008
Academic year of graduation: 96 (ROC calendar)
Language: English
Number of pages: 40
Keywords: Clustering Feature, Clustering Feature Tree

Intrusion detection is the process of monitoring the events occurring in a computer system or network and analyzing them for signs of intrusion. Much research has sought to build an ideal intrusion detection system (IDS), that is, a system that can detect both known and new attacks. Support vector machines (SVMs) are known as a promising method because of their classification accuracy and generalization ability. In this research, we design an SVM-based intrusion detection system that combines a hierarchical clustering algorithm, a feature selection process, and SVM classification. The hierarchical clustering provides the SVM with high-quality training instances from the original training set. The feature selection process eliminates unimportant features from the training set so that the resulting SVM model can classify network traffic data accurately. Our experiments on the KDD Cup 1999 data set show that our method achieves high classification accuracy with a low false positive rate.
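
The keywords above name the Clustering Feature (CF), the summary that a BIRCH-style CF tree stores for each cluster. The sketch below is not taken from the thesis. It is a minimal illustration, assuming the standard BIRCH definition of a CF as the triple (N, LS, SS): the point count, the per-dimension linear sum, and the sum of squared norms. The names `ClusteringFeature` and `summarize` are hypothetical. It shows how a group of training points collapses into one centroid, which could then stand in for the group as a training instance for the SVM.

```python
import math
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class ClusteringFeature:
    """BIRCH-style cluster summary: count, linear sum, sum of squared norms."""
    n: int
    linear_sum: List[float]
    square_sum: float

    @classmethod
    def from_point(cls, point: Sequence[float]) -> "ClusteringFeature":
        # A single point is a cluster of size one.
        values = [float(x) for x in point]
        return cls(1, values, sum(v * v for v in values))

    def merge(self, other: "ClusteringFeature") -> "ClusteringFeature":
        # CFs are additive: merging two clusters adds the components.
        return ClusteringFeature(
            self.n + other.n,
            [a + b for a, b in zip(self.linear_sum, other.linear_sum)],
            self.square_sum + other.square_sum,
        )

    def centroid(self) -> List[float]:
        return [s / self.n for s in self.linear_sum]

    def radius(self) -> float:
        # Root-mean-square distance of the members from the centroid.
        c = self.centroid()
        variance = self.square_sum / self.n - sum(x * x for x in c)
        return math.sqrt(max(variance, 0.0))  # guard against rounding below zero


def summarize(points: Sequence[Sequence[float]]) -> ClusteringFeature:
    """Fold a non-empty list of points into one CF (hypothetical helper)."""
    cf = ClusteringFeature.from_point(points[0])
    for p in points[1:]:
        cf = cf.merge(ClusteringFeature.from_point(p))
    return cf


if __name__ == "__main__":
    cf = summarize([(0.0, 0.0), (2.0, 0.0)])
    print(cf.n, cf.centroid(), cf.radius())
```

Because the summary is additive, a point can be absorbed into an existing cluster without revisiting that cluster's members, which is what makes this kind of summary practical on a data set the size of KDD Cup 1999. How the thesis chooses the cluster threshold and branching factor is not stated in this record.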



ABSTRACT
ACKNOWLEDGEMENTS
TABLE OF CONTENTS
LIST OF FIGURES
LIST OF TABLES
CHAPTER I. Introduction
  I.1 Overview of Network Intrusion Detection System
  I.2 Using SVM as a Classification Technique
  I.3 Related Work
  I.4 Thesis Organization
CHAPTER II. Hierarchical Clustering and Support Vector Machines
  II.1 Hierarchical Clustering
    II.1.1 Clustering Feature
    II.1.2 CF Tree
  II.2 Support Vector Machines
CHAPTER III. SVM with Hierarchical Clustering
  III.1 Data Transformation and Scaling
  III.2 Construct a CF Tree
  III.3 Feature Selection
    III.3.1 Methodology of Feature Selection
    III.3.2 Performance Metrics
CHAPTER IV. Experimental Result
  IV.1 KDD Cup 1999 Data Set
  IV.2 Experiment Setup
  IV.3 Experimental Result
  IV.4 Comparison with Other Intrusion Detection System
  IV.5 Evaluation
CHAPTER V
REFERENCES

