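Graduate student: Yu-Shan Hsu (許毓珊)
Thesis title: A Hybrid IDS framework via Decision Trees and SVMs (由決策樹與支撐向量機建構的入侵偵測系統)
Advisor: Yuh-Jye Lee (李育杰)
Oral defense committee: 鮑興國
Degree: Master's
Department: College of Electrical Engineering and Computer Science, Department of Computer Science and Information Engineering
Year of publication: 2006
Academic year of graduation: 94
Language: English
Number of pages: 51
Chinese keywords: data filter, decision tree, intrusion detection system, one-class support vector machine, smooth support vector machine, support vector machine
English keywords: DARPA, data filter, one class support vector machine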
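With the rapid growth of Internet information systems, researchers, government agencies, and commercial organizations have begun to pay close attention to the risks posed by network attacks. An intrusion detection system detects whether a computer network system is being subjected to inappropriate, incorrect, or anomalous intrusion behavior, and then alerts the administrator. In this thesis, we propose an intrusion detection system that distinguishes normal from anomalous behavior, and we treat this as a classification problem. To overcome the difficulty of handling a large dataset during training, the first stage of the system uses a decision tree as a data filter: the highly accurate decisions produced by the decision tree classify part of the dataset, which effectively reduces the volume of data. In the second stage, we use the one-class support vector machine and the smooth support vector machine as the core techniques to continue classifying the data that the decision tree could not classify. We combine the two into a carefully designed architecture that provides effective intrusion detection. Finally, we evaluate our system on the dataset from the 1999 KDD competition. The experiments show that the proposed system slightly outperforms that year's winner in intrusion detection; in addition, its detection rates for the U2R and R2L attack categories are higher than those of several well-known algorithms.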

As the Internet and networked systems have become more widespread and advanced, researchers, government organizations, and many commercial firms have begun to pay close attention to the growing risk of attacks. An intrusion detection system (IDS) detects inappropriate, incorrect, or anomalous activity in computer networks and alerts the administrator when suspicious events occur. In this thesis, we propose an IDS framework that discriminates between normal connections and intrusive activities, and we treat this as a classification problem. Because the huge dataset is difficult to handle during training, we employ a decision tree as a data filter in the first classification stage to reduce the size of the dataset effectively. The decision tree generates rules with high accuracy, which we use to classify part of the dataset. The remaining data, which the decision tree leaves unclassified, are passed to the second stage of our system. There we use two extensions of the support vector machine (SVM), the one-class support vector machine (OCSVM) and the smooth support vector machine (SSVM), as the core techniques. We combine them to construct a more efficient and effective intrusion detection system. We evaluate our system on the 1999 KDD intrusion detection dataset. Our system has a slight advantage over the 1999 KDD Cup winner in overall accuracy, and it achieves better prediction rates for U2R and R2L connections than other well-known algorithms.
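The filter-then-classify idea in the abstract can be illustrated with a minimal sketch. It is not the thesis implementation: the first stage here is a single-split decision tree (a stump) whose leaves are trusted only above a purity threshold, and the second stage is a nearest-centroid classifier standing in for the OCSVM/SSVM combination, which the abstract does not specify in detail. All names (`fit_stump`, `TwoStageCascade`, `min_purity`) and the toy records are hypothetical.

```python
from collections import Counter

def fit_stump(rows, labels, feature, threshold):
    """Depth-1 tree: split on rows[i][feature] <= threshold.
    Returns {side: (majority_label, purity)} for each non-empty side."""
    leaves = {}
    for side in (True, False):
        members = [y for x, y in zip(rows, labels)
                   if (x[feature] <= threshold) == side]
        if members:
            label, count = Counter(members).most_common(1)[0]
            leaves[side] = (label, count / len(members))
    return leaves

def centroid(rows):
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]

def sq_dist(a, b):
    return sum((p - q) ** 2 for p, q in zip(a, b))

class TwoStageCascade:
    """Stage 1 filters records with high-purity tree leaves;
    stage 2 (a stand-in classifier, NOT an SVM) handles the rest."""

    def __init__(self, feature, threshold, min_purity=0.95):
        self.feature = feature
        self.threshold = threshold
        self.min_purity = min_purity  # assumed cut-off, not from the thesis

    def _side(self, x):
        return x[self.feature] <= self.threshold

    def fit(self, rows, labels):
        leaves = fit_stump(rows, labels, self.feature, self.threshold)
        # Only leaves that are pure enough are allowed to make final decisions.
        self.trusted = {side: label for side, (label, purity) in leaves.items()
                        if purity >= self.min_purity}
        # Stage 2 trains only on the records the filter could not settle.
        rest = [(x, y) for x, y in zip(rows, labels)
                if self._side(x) not in self.trusted]
        self.n_filtered = len(rows) - len(rest)
        by_label = {}
        for x, y in rest:
            by_label.setdefault(y, []).append(x)
        self.centroids = {y: centroid(xs) for y, xs in by_label.items()}
        self.default = Counter(labels).most_common(1)[0][0]
        return self

    def predict_one(self, x):
        side = self._side(x)
        if side in self.trusted:
            return self.trusted[side]
        if not self.centroids:
            return self.default
        return min(self.centroids, key=lambda y: sq_dist(x, self.centroids[y]))

# Toy records: [duration, failed_logins]; illustrative values only.
ROWS = [[1.0, 0], [2.0, 0], [1.5, 0], [3.0, 0],
        [1.0, 3], [9.0, 2], [2.0, 1], [8.0, 1]]
LABELS = ["normal", "normal", "normal", "normal",
          "attack", "attack", "normal", "attack"]

model = TwoStageCascade(feature=1, threshold=0).fit(ROWS, LABELS)
```

In this toy run the `failed_logins <= 0` leaf is perfectly pure, so half of the training records are settled by the filter and only the remaining four reach the second stage. In a setting like the one the abstract describes, the stump would be replaced by a full decision tree and the centroid step by the SVM-based classifiers.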

Contents
1 Introduction
  1.1 Network Security
  1.2 Data Mining
  1.3 Organization of Thesis
2 Overview of Intrusion Detection Systems
  2.1 Background
  2.2 Category of IDS
  2.3 Related Works
3 Data Mining Technology for Classification
  3.1 Decision Trees
  3.2 Support Vector Machines
  3.3 Smooth Support Vector Machines
  3.4 One Class Support Vector Machines
4 System Framework
  4.1 Motivation
  4.2 Architecture of Our System
  4.3 Feature Selection
5 Experiments
  5.1 Dataset Description
  5.2 Data Preprocessing
  5.3 Training processes
  5.4 Numerical Results
6 Conclusions and Discussions

