
Author: Yu-Shan Hsu (許毓珊)
Thesis title: A Hybrid IDS framework via Decision Trees and SVMs (由決策樹與支撐向量機建構的入侵偵測系統)
Advisor: Yuh-Jye Lee (李育杰)
Committee members: 鮑興國, 項天瑞, 吳怡樂, 賴源正
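Degree: Master
Department: College of Electrical Engineering and Computer Science - Department of Computer Science and Information Engineering
Year of publication: 2006
Academic year of graduation: 94 (ROC calendar)
Language: English
Number of pages: 51
Keywords (Chinese): data filter, decision tree, intrusion detection system, one-class support vector machine, smooth support vector machine, support vector machine
Keywords (English): DARPA, data filter, one class support vector machine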
    As the Internet and networked systems become more widespread and advanced, researchers, government organizations, and commercial firms are paying close attention to the growing risk of attacks. An intrusion detection system (IDS) detects inappropriate, incorrect, or anomalous activity in a computer network and alerts the administrator when suspicious events occur. In this thesis, we propose an IDS framework that discriminates between normal connections and intrusive activity, and we treat the task as a classification problem. Because training on the full dataset is difficult at this scale, the first stage uses a decision tree as a data filter: the tree generates rules with high accuracy, and these rules classify part of the dataset, which effectively reduces the amount of data left to process. The records that the decision tree leaves unclassified go on to the second stage, whose core techniques are two variants of the support vector machine (SVM): the one-class support vector machine (OCSVM) and the smooth support vector machine (SSVM). We combine them in a carefully designed architecture to build a more efficient and effective intrusion detection system. We evaluate the system on the 1999 KDD Cup intrusion detection dataset. Its overall accuracy is slightly higher than that of the 1999 KDD Cup winner, and its detection rates for U2R and R2L connections are higher than those of several well-known algorithms.
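The two-stage flow described in the abstract can be illustrated with a minimal sketch. It shows only the routing idea: rules with high enough accuracy classify a record in stage 1, and everything else falls through to stage 2. The `Rule` structure, the feature names, the accuracy threshold, and the toy rules are all hypothetical, and the second stage is a simple placeholder function rather than an OCSVM or SSVM, since the abstract does not specify how the two are combined.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Tuple

Record = Dict[str, float]


@dataclass
class Rule:
    """One decision-tree leaf expressed as a rule (hypothetical structure)."""
    matches: Callable[[Record], bool]  # path condition from root to leaf
    label: str                         # class predicted at the leaf
    accuracy: float                    # leaf accuracy measured on training data


def filter_stage(record: Record, rules: List[Rule], min_accuracy: float) -> Optional[str]:
    """Stage 1: return a label only if a sufficiently accurate rule fires."""
    for rule in rules:
        if rule.accuracy >= min_accuracy and rule.matches(record):
            return rule.label
    return None  # left unclassified; handed to stage 2


def classify(record: Record,
             rules: List[Rule],
             second_stage: Callable[[Record], str],
             min_accuracy: float = 0.99) -> Tuple[str, str]:
    """Return (label, stage that produced it)."""
    label = filter_stage(record, rules, min_accuracy)
    if label is not None:
        return label, "stage1"
    return second_stage(record), "stage2"


# Toy, made-up rules and feature names; not taken from the thesis.
RULES = [
    Rule(lambda r: r["failed_logins"] == 0 and r["bytes"] < 1000, "normal", 0.999),
    Rule(lambda r: r["bytes"] >= 50000, "attack", 0.80),  # below threshold, so unused
]


def placeholder_second_stage(record: Record) -> str:
    """Stand-in for the OCSVM/SSVM stage; a plain threshold, not an SVM."""
    return "attack" if record["failed_logins"] > 3 else "normal"


def demo() -> List[Tuple[str, str]]:
    records = [
        {"failed_logins": 0, "bytes": 200},
        {"failed_logins": 5, "bytes": 60000},
        {"failed_logins": 1, "bytes": 60000},
    ]
    return [classify(r, RULES, placeholder_second_stage) for r in records]
```

In this sketch, only the first toy record is settled by the filter; the other two match no rule that meets the accuracy threshold and are passed to the placeholder second stage. Raising `min_accuracy` sends more records to stage 2, while lowering it lets the filter absorb more data at the cost of trusting weaker rules.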

    Contents
    1 Introduction
      1.1 Network Security
      1.2 Data Mining
      1.3 Organization of Thesis
    2 Overview of Intrusion Detection Systems
      2.1 Background
      2.2 Category of IDS
      2.3 Related Works
    3 Data Mining Technology for Classification
      3.1 Decision Trees
      3.2 Support Vector Machines
      3.3 Smooth Support Vector Machines
      3.4 One Class Support Vector Machines
    4 System Framework
      4.1 Motivation
      4.2 Architecture of Our System
      4.3 Feature Selection
    5 Experiments
      5.1 Dataset Description
      5.2 Data Preprocessing
      5.3 Training processes
      5.4 Numerical Results
    6 Conclusions and Discussions

