
Author: 林恆生
Heng-Sheng Lin
Thesis title: 利用適應性遞增式學習演算法及群聚演算法降低入侵偵測虛警率
Incremental Adaptive Learning and Alert Grouping for False Alarm Reduction in Intrusion Detection
Advisor: 李漢銘
Hahn-Ming Lee
Committee members: 賴溪松
Chi-Sung Laih
郭耀煌
Yau-Hwang Kuo
林豐澤
Feng-Tse Lin
鮑興國
Hsing-Kuo Kenneth Pao
Degree: Master
Department: College of Electrical Engineering and Computer Science - Department of Computer Science and Information Engineering
Year of publication: 2007
Academic year of graduation: 95 (ROC calendar)
Language: English
Number of pages: 66
Keywords (Chinese): 組合式分類器、串流資料、入侵偵測、虛警報、機器學習、群聚分析
Keywords (English): Ensemble classification, streaming data, intrusion detection, false positives, machine learning, cluster analysis
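  • As networks are applied widely and diversely in commerce, industry, government, corporate organizations, and even personal communities, a variety of mature attack techniques aim to paralyze these network services or obtain confidential information. Deploying intrusion detection systems (IDSs) at scale has therefore become the most basic protection measure in an organization's network. As IDSs have developed over the years, however, they have produced a side effect: a large number of false alarms. Network analysts and administrators must spend extra time picking the real alarms out of a mass of unhelpful false ones. Our contribution is to analyze the raw alerts with data mining and adaptive learning and to give analysts organized information: a predicted class, groups of similar alerts, and a statistically ranked list of attack signature rules. This helps analysts quickly identify meaningful alerts and retune their IDS. To meet these needs, we propose two algorithms for an on-line system, one for classification and one for grouping similar alerts. The first is the Incremental Adaptive Concept Learning (IACL) algorithm, which pre-classifies alerts as true or false alarms. It can incrementally learn new knowledge and adapt to changes in behavior. It is also a continuous learning algorithm: the underlying model learns new knowledge from new examples only and does not have to return to its initial state and relearn everything each time, which better matches the needs of a system in actual operation. The second is the On-line Alert Grouping (OAG) algorithm. It is designed for the case where an IDS repeatedly issues duplicate or similar alerts that together represent a single malicious action, so that analysts can work from the grouped information and obtain more clearly separated event alerts. In experiments, the algorithms achieved good results in both prediction accuracy and average accuracy; we believe average accuracy is the more important of the two for a stable operational system.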


    As network-based applications become increasingly diverse in commerce, government, organizations, and social network communities, attempts to compromise those services or steal sensitive information have become increasingly sophisticated. Consequently, Intrusion Detection Systems (IDSs) have been adopted as an essential protection method. However, IDSs have side effects, particularly the large number of false alarms, which bury relevant alarms under irrelevant information. As a result, analysts and network administrators waste considerable time finding the relevant alarms. This study presents a system that provides organized information of three kinds: the predicted class, which labels an alarm as relevant or irrelevant; the group information, which collects redundant or similar alarms into a group representing a single event; and statistical information, a ranked list of low-value signatures that helps analysts tune the rules of their signature-based IDS. The system uses two proposed algorithms drawn from machine learning and data mining. The first is the Incremental Adaptive Concept Learning (IACL) algorithm, which trains the committee classifier that categorizes incoming alarms as relevant or irrelevant. The algorithm learns new knowledge incrementally and adapts to changing target concepts. It is a continuous learning method: the model is trained on recently collected data only, without revisiting the entire accumulated data set. This is more practical for on-line operation than the ideal case of retraining the underlying model on all recorded data each time learning is invoked. The second is the On-line Alert Grouping (OAG) algorithm, which is designed to reduce redundant alarm information by grouping similar or repetitive alarms into a single alarm group that refers to a single event.
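The grouping idea described above can be illustrated with a minimal Python sketch. It is not the thesis's OAG algorithm, whose similarity measure this record does not give. The sketch assumes that two alerts belong to the same event when they share a signature, source address, and destination address and arrive within a fixed time window of each other. The names `Alert`, `AlertGroup`, `OnlineGrouper`, and `window_seconds` are hypothetical, and alerts are assumed to arrive in timestamp order.

```python
from dataclasses import dataclass, field


@dataclass
class Alert:
    timestamp: float   # seconds since an arbitrary start (assumed field)
    signature: str     # name of the IDS rule that fired (assumed field)
    src_ip: str
    dst_ip: str


@dataclass
class AlertGroup:
    key: tuple
    first_seen: float
    last_seen: float
    alerts: list = field(default_factory=list)


class OnlineGrouper:
    """Groups alerts that share (signature, src, dst) and arrive close in time."""

    def __init__(self, window_seconds: float = 60.0):
        self.window_seconds = window_seconds
        self.open_groups = {}   # key -> most recent group for that key
        self.groups = []        # every group created, in creation order

    def add(self, alert: Alert) -> AlertGroup:
        key = (alert.signature, alert.src_ip, alert.dst_ip)
        group = self.open_groups.get(key)
        # Open a new group if none exists for this key or the last one is stale.
        if group is None or alert.timestamp - group.last_seen > self.window_seconds:
            group = AlertGroup(key, alert.timestamp, alert.timestamp)
            self.open_groups[key] = group
            self.groups.append(group)
        group.alerts.append(alert)
        group.last_seen = alert.timestamp
        return group
```

Each alert is handled once as it arrives, so an analyst would review one `AlertGroup` per event instead of every repeated alert. This sketch keeps every group in memory; a deployment with limited resources would also need to expire old groups.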
Moreover, experimental results demonstrate that the IACL algorithm performs better in terms of accuracy and resource use than either combining all trained models or keeping only the model learned last after each learning process. In particular, the proposed learning model has a better average accuracy than the others tested, which indicates better stability. Finally, on-line operation requirements, such as limited resources, are also considered.
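To make the comparison concrete, the following Python sketch shows a generic middle ground between keeping every trained model and keeping only the last one: a size-bounded committee that trains one new member per data chunk, drops the member that fits the newest chunk worst, and predicts by majority vote. It is an illustration under stated assumptions, not the IACL algorithm itself, whose rules are not given in this record. The base learner, the names `SignatureMajorityLearner` and `BoundedCommittee`, and the `max_members` parameter are all hypothetical, and every chunk is assumed to be non-empty.

```python
from collections import Counter, defaultdict


class SignatureMajorityLearner:
    """Toy base learner: predicts the most common label seen for a signature."""

    def fit(self, chunk):
        # chunk: non-empty list of (signature, label); label True = relevant alarm
        per_signature = defaultdict(Counter)
        overall = Counter()
        for signature, label in chunk:
            per_signature[signature][label] += 1
            overall[label] += 1
        self.default = overall.most_common(1)[0][0]
        self.table = {s: c.most_common(1)[0][0] for s, c in per_signature.items()}
        return self

    def predict(self, signature):
        # Unseen signatures fall back to the chunk's majority label.
        return self.table.get(signature, self.default)


def accuracy(model, chunk):
    return sum(model.predict(s) == y for s, y in chunk) / len(chunk)


class BoundedCommittee:
    def __init__(self, max_members=5):
        self.max_members = max_members
        self.members = []

    def learn(self, chunk):
        # Train on the new chunk only; earlier data is never revisited.
        self.members.append(SignatureMajorityLearner().fit(chunk))
        if len(self.members) > self.max_members:
            # Forget the member that fits the newest data worst.
            worst = min(self.members, key=lambda m: accuracy(m, chunk))
            self.members.remove(worst)

    def predict(self, signature):
        # Majority vote; call learn() at least once before predicting.
        votes = Counter(m.predict(signature) for m in self.members)
        return votes.most_common(1)[0][0]
```

Bounding the committee keeps memory use constant, and removing the worst-fitting member lets the vote follow a changed concept after a few chunks, while the remaining older members smooth out the effect of any single unrepresentative chunk.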

    Abstract
    Acknowledgements
    Content
    List of Tables
    List of Figures
    Chapter 1 Introduction
        1.1 Motivation
        1.2 False Alarm Problem in Intrusion Detection System (IDS)
        1.3 Goals
        1.4 Outline of the Thesis
    Chapter 2 Background
        2.1 Baseline: Classification by Committee
        2.2 Incremental Learning and Adaptive Learning
    Chapter 3 False Alarm Reduction Using IACL and OAG algorithm
        3.1 System Overview
            3.1.1 Preprocess
            3.1.2 Learning Process
            3.1.3 Information Organization Process
            3.1.4 User Operation Interface
            3.1.5 System Flow of False Alarm Reduction
        3.2 Incremental Adaptive Concept Learning
            3.2.1 Complementary Learning Principle
            3.2.2 Fitting Forgetting Principle
            3.2.3 Validated Adapting Principle
            3.2.4 Majority Voting Principle
        3.3 Alert Grouping for Reducing Redundancy
    Chapter 4 Experimental Evaluation
        4.1 Experimental Data
        4.2 Experimental Setup
        4.3 Experimental Results
    Chapter 5 Conclusion and Further Work
        5.1 Discussion
        5.2 Conclusion
        5.3 Further Work
    References
    Vita

