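A Three-tier IDS via Data Mining Approach (一個以資料探勘方式建立的三層式入侵偵測系統) | National Taiwan University of Science and Technology Theses and Dissertations System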

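Graduate student:	Tsong-Song Hwang (黃聰松)
Thesis title:	A Three-tier IDS via Data Mining Approach (一個以資料探勘方式建立的三層式入侵偵測系統)
Advisor:	Yuh-Jye Lee (李育杰)
Oral examination committee:	鮑興國, 項天瑞, 吳怡樂, 賴源正
Degree:	Master
Department:	College of Electrical Engineering and Computer Science, Department of Computer Science and Information Engineering
Year of publication:	2006
Academic year of graduation:	94 (ROC calendar)
Language:	English
Number of pages:	60
Keywords (translated from Chinese):	intrusion detection system, blacklist, whitelist, multiclass support vector machines, KDD'99, RIPPER, false alarm rate
Keywords (English):	blacklist, whitelist, KDD'99, multiclass SVMs, false alarm rate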

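In this thesis, we use the concepts of blacklist/whitelist and multiclass support vector machines (SVMs) to build a three-tier intrusion detection system through data mining. Here the blacklist is a set of patterns of known attacks, the whitelist is a set of behaviors of normal activities, and multiclass SVMs are the method for classifying the detected intrusions. This three-tier architecture improves detection accuracy (intrusion detection rate of 94.71%). We use the KDD'99 dataset to evaluate the proposed method.
We train the RIPPER rule learning system on the 10% training dataset provided by KDD'99 to obtain a blacklist containing the patterns of known attacks. The blacklist is validated against the full training dataset, using the 3,925,650 attacks among its 4,898,431 records. The resulting blacklist finds 97.54% of the known attacks and 7.84% of the unknown attacks in the test dataset. Next, we group 972,734 normal connections into 27 activities by protocol and network service, and compute the range of each activity to obtain our whitelist; each activity's range is used to check whether an incoming connection is an attack. Our whitelist detects 1.91% of the known attacks and 23.40% of the unknown attacks in the test dataset. Finally, multiclass SVMs classify the attacks detected by the whitelist.
The intrusion detection rate and intrusion diagnosis rate of this three-tier IDS are 94.71% and 93.52%, respectively, and the average cost per connection is 0.1781; all of these results are better than those of the KDD'99 winner. In addition, our false alarm rate is only 3.8%.
Our system is quite flexible: the blacklist lets network administrators add new attack patterns as needed, and the whitelist lets them fine-tune it according to the actual state of their system.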

In this thesis, we design a three-tier intrusion detection system
(IDS) via a data mining approach, based on the concept of
blacklist/whitelist and the method of multiclass SVMs. Here the
blacklist is a list of patterns of known attacks, the whitelist is
a list of behaviors of normal activities, and multiclass SVMs are
the method for categorizing the anomalies detected by the whitelist
into the four attack classes: PROBE, DoS, U2R and R2L. Using the
blacklist and whitelist together can improve detection performance,
giving an intrusion detection rate of 94.71%. The KDD'99 benchmark
dataset is used to evaluate the designed IDS.
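
The order in which the three tiers are consulted can be sketched as follows. This is a minimal illustration, not the thesis implementation: the function `three_tier_classify`, the three `toy_*` stand-ins, the feature values, and the assumption that a blacklist match directly yields an attack class are all hypothetical.

```python
from typing import Callable, Dict, Optional

Connection = Dict[str, object]  # one connection record: feature name -> value


def three_tier_classify(
    conn: Connection,
    blacklist: Callable[[Connection], Optional[str]],
    whitelist: Callable[[Connection], bool],
    categorize: Callable[[Connection], str],
) -> str:
    """Return 'normal' or an attack class for a single connection."""
    # Tier 1 (blacklist): a known-attack pattern matches.
    known_attack = blacklist(conn)
    if known_attack is not None:
        return known_attack
    # Tier 2 (whitelist): the connection fits a normal-activity profile.
    if whitelist(conn):
        return "normal"
    # Tier 3: the whitelist flagged an anomaly, so assign an attack class.
    return categorize(conn)


# Hypothetical stand-ins for the three trained components.
def toy_blacklist(conn: Connection) -> Optional[str]:
    # One hand-written rule in place of a learned RIPPER rule set.
    return "DoS" if conn.get("count", 0) > 400 else None


def toy_whitelist(conn: Connection) -> bool:
    return conn.get("service") == "http" and conn.get("src_bytes", 0) < 10_000


def toy_categorize(conn: Connection) -> str:
    # Placeholder for the multiclass SVM.
    return "PROBE"


if __name__ == "__main__":
    sample = {"service": "http", "src_bytes": 300, "count": 2}
    print(three_tier_classify(sample, toy_blacklist, toy_whitelist, toy_categorize))
```

Passing the three components as callables keeps the sketch independent of how each tier is trained, so a rule set, a profile table, or an SVM can be swapped in without changing the control flow.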

The RIPPER rule learning system is applied to the smaller 10%
training dataset to obtain the blacklist, a rule set containing the
patterns of old attacks. The blacklist is validated on the entire
training dataset, which has 3,925,650 attacks among 4,898,431
examples. The blacklist detects 97.54% of old attacks and 7.84% of
new attacks in the test dataset. The whitelist is then obtained by
categorizing 972,734 normal connections into 27 activities,
according to the features "protocol_type" and "service", and
calculating each activity's profile. The profile of each activity
is used to decide whether an incoming connection is an attack. The
whitelist detects 1.91% of old attacks and 23.40% of new attacks in
the test dataset. Finally, multiclass SVMs classify the attacks
detected by the whitelist into their specific categories.
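
A rough sketch of the whitelist idea follows. The abstract does not say how an activity's profile is computed, so this example assumes the simplest reading: a per-feature minimum and maximum over the normal connections of each ("protocol_type", "service") pair. The helper names `build_whitelist` and `is_anomalous`, the choice of features, and the sample records are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

Key = Tuple[str, str]                     # (protocol_type, service)
Profile = Dict[str, Tuple[float, float]]  # feature -> (min, max) in normal traffic


def build_whitelist(normal: Iterable[dict], features: List[str]) -> Dict[Key, Profile]:
    """Group normal connections into activities and record each feature's range."""
    profiles: Dict[Key, Profile] = defaultdict(dict)
    for conn in normal:
        profile = profiles[(conn["protocol_type"], conn["service"])]
        for name in features:
            low, high = profile.get(name, (conn[name], conn[name]))
            profile[name] = (min(low, conn[name]), max(high, conn[name]))
    return dict(profiles)


def is_anomalous(conn: dict, whitelist: Dict[Key, Profile]) -> bool:
    """Flag a connection whose activity is unknown or whose features leave the range."""
    profile = whitelist.get((conn["protocol_type"], conn["service"]))
    if profile is None:
        return True
    return any(not (low <= conn[name] <= high) for name, (low, high) in profile.items())


if __name__ == "__main__":
    # Hypothetical normal connections, for illustration only.
    normal = [
        {"protocol_type": "tcp", "service": "http", "duration": 0, "src_bytes": 200},
        {"protocol_type": "tcp", "service": "http", "duration": 3, "src_bytes": 350},
        {"protocol_type": "udp", "service": "domain_u", "duration": 0, "src_bytes": 40},
    ]
    whitelist = build_whitelist(normal, ["duration", "src_bytes"])
    incoming = {"protocol_type": "tcp", "service": "http", "duration": 1, "src_bytes": 50_000}
    print(is_anomalous(incoming, whitelist))
```

Under this assumption, fine-tuning the whitelist amounts to widening or narrowing the stored ranges for an activity.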

The intrusion detection performance and intrusion diagnosis
performance of the three-tier IDS are 94.71% and 93.52%,
respectively. The average cost per connection is 0.1781. These
results are all better than those of the KDD'99 winner. Our false
alarm rate is only 3.8%.
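
The sketch below shows how figures of this kind can be derived from a five-class confusion matrix. Several things here are assumptions rather than statements from the thesis: the cost matrix is reproduced from memory of the KDD'99 scoring table and should be checked against the official task description, the detection rate is taken as the share of attack connections labelled as any attack class, the false alarm rate as the share of normal connections labelled as an attack, and the confusion matrix is made-up toy data.

```python
from typing import List, Tuple

CLASSES = ["normal", "PROBE", "DoS", "U2R", "R2L"]

# Misclassification costs: rows = actual class, columns = predicted class.
# Assumed to match the KDD'99 scoring table; verify before relying on it.
COST = [
    [0, 1, 2, 2, 2],
    [1, 0, 2, 2, 2],
    [2, 1, 0, 2, 2],
    [3, 2, 2, 0, 2],
    [4, 2, 2, 2, 0],
]


def evaluate(confusion: List[List[int]]) -> Tuple[float, float, float]:
    """Return (average cost per connection, detection rate, false alarm rate)."""
    n = len(CLASSES)
    total = sum(sum(row) for row in confusion)
    total_cost = sum(confusion[i][j] * COST[i][j] for i in range(n) for j in range(n))
    # Normal connections (row 0) predicted as any attack class.
    false_alarm_rate = sum(confusion[0][1:]) / sum(confusion[0])
    # Attack connections (rows 1..4) predicted as any attack class.
    attacks = sum(sum(row) for row in confusion[1:])
    detected = sum(sum(row[1:]) for row in confusion[1:])
    return total_cost / total, detected / attacks, false_alarm_rate


if __name__ == "__main__":
    # Hypothetical confusion matrix, for illustration only.
    toy = [
        [90, 5, 5, 0, 0],
        [2, 18, 0, 0, 0],
        [5, 0, 95, 0, 0],
        [1, 0, 0, 4, 0],
        [10, 0, 0, 0, 15],
    ]
    avg_cost, detection_rate, false_alarm_rate = evaluate(toy)
    print(avg_cost, detection_rate, false_alarm_rate)
```

Because missed U2R and R2L attacks carry the largest costs in this table, a lower average cost reflects which errors are made, not only how many.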

The system is flexible: the blacklist allows the network manager to
add new patterns to the rule set, and the whitelist allows the
network manager to fine-tune it according to the situation of their
system.

Contents
Introduction
  1 Problem Statement and Our Approach
  2 Thesis Contributions
  3 Thesis Organization
Intrusion Detection Systems
  1 Types of Attacks
  2 Network-based IDS and Host-based IDS
  3 Misuse Detector and Anomaly Detector
Data Mining Methods and Architecture of the Three-tier IDS
  1 RIPPER
  2 Activity Profiling
  3 Support Vector Machines
    3.1 Conventional Support Vector Machines
    3.2 Smooth Support Vector Machines
  4 Architecture
Experiments
  1 KDD'99 Dataset
  2 Three-tier Experiments
    2.1 Blacklist (MD)
    2.2 Whitelist (AD)
    2.3 SVM Classifiers
  3 Results and Discussion
Conclusions and Future Work

