簡易檢索 / 詳目顯示

研究生: 毛敬豪
Ching-Hao Mao
論文名稱: 以序列為基礎之網路異常事件分析與攻擊行為偵測
Sequence-based Anomaly Detection for Analyzing and Identifying Malicious Network Behavior
指導教授: 李漢銘
Hahn-Ming Lee
鮑興國
Hsing-Kuo Pao
口試委員: 蔡益坤
Yih-Kuen Tsay
李育杰
Yuh-Jye Lee
林豐澤
Feng-Tse Lin
鄭博仁
Bor-Ren Jeng
田筱榮
Hsiao-Rong Tyan
學位類別: 博士
Doctor
系所名稱: 電資學院 - 資訊工程系
Department of Computer Science and Information Engineering
論文出版年: 2010
畢業學年度: 99
語文別: 英文
論文頁數: 110
中文關鍵詞: 異常偵測相似度警報關聯馬可夫模型
外文關鍵詞: Isomap
相關次數: 點閱:364下載:25
分享至:
查詢本校圖書館目錄 查詢臺灣博碩士論文知識加值系統 勘誤回報
  • 相同意圖與目的的惡意行為往往具有不同的序列組合,攻擊行為的變異提高了偵測的困難性。攻擊序列之變異起源於:(1) 具因果關係的關聯的多樣化呈現方式;(2)雜訊的插入以及;(3) 不同行為的交錯穿插。這些問題造成現今入侵偵測系統在捕捉具因果關係之惡意行為時,無法正確關聯並建構出攻擊場景 並產生大量的假警報(false alarm)。此問題更造成對於專家與分析人力的高度需求,以分析入侵偵測系統所發動的警報事件。
    在本論文,我們提出一個以序列為基礎之攻擊偵測機制 (Sequence-based Anomaly Detection, SBAD)以找出具因果關係之惡意行為。在這機制中,我們先提出以主機拓樸網路為基礎之行為萃取機制,可再不需任何參數與預先知識下,自動萃取行為序列;接著,透過機率圖型模型-馬可夫鏈模型-對於各個行為序列進行描述,夠過多步驟滑動視窗之二元語法關聯方法(n-step-window-based bi-gram),可容忍不同攻擊於序列呈現時所造成的變異、行為交錯以及長片段行為插入。本研究之核心在於透過最小描述長度(minimal description length)的編碼方式,提出一個新的序列行為距離量測機制,較大行為距離則代表較不相似行為序列,反之亦然。最後,我們透過流形學習(manifold learning)的分析方式,從大量的特徵中萃取具有意義的特徵,並可提供視覺化的介面,並可搭配各種分類與學習演算機制,找出具可疑或是惡意的行為序列。
    我們所提出的SBAD具有以下貢獻:(1) 提出不需預先知識以及參數之行為序列萃取機制;(2) 提出一個新的事件序列差異度評估機制,這機制不僅可評估行為串列之間的差異度,更可評估序列與圖形機率模型之間的差異度;(3) 藉由實際的資料驗證,我們所提出的方法可有效率的處理具雜訊、行為穿插以及長片段的插入的行為序列等;(4) 我們的方法的運算時間複雜度與所處理的警報各數呈線性成長的關係;(5) 透過視覺化的機制(Isomap)可在低維度的空間呈現正常與異常行為的分布。
    本研究透過公開的資料集、非公開的資料集以及真實世界的資料進行評估所提出的方法。評估的效果顯示所提出的SBAD可找出惡意的行為於巨大量雜訊的環境。而透過圖形結構以及流形學習的呈現,可自然的提供視覺化的結果並讓資安研究或營運人員做進一步的分析。


    Malicious behaviors with similar intent or purpose often possess different data sequence patterns which increase the difficulty in malicious behavior identification. These data sequence variations originated from the following three sources: (1) the multiplicity of causal relationships appearing in data sequence patterns, (2) the injection of noises in the attack sequence, and (3) the interwoven of various malicious behaviors. These problems make the current intrusion detection systems perform poorly in capturing the causal relation of malicious behaviors, or precisely correlating their relationship and correctly reconstructing the attack scenario, and consequently producing considerable false alarms. Furthermore, these problems caused the analysis of the abnormal events flagged by the intrusion detection system heavily dependent on the security experts and analytical man-power.

    In this thesis, we propose a Sequence-based Anomaly Detection (SBAD) mechanism for identifying malicious behaviors with causal relationship. In this mechanism, we first proposed a behavior thread extraction mechanism based on host topologies which could automatically extract behavior sequence without any parameter configuration and prior knowledge. Then we described each behavior sequence using a probabilistic graph-based model and the “n-step-window-based bi-gram” correlation method, which could tolerate large variances, resulted from different data sequences manifested by different malicious attacks, event interleaving and long interjections. Using these models, we proposed a novel “sequence dissimilarity measure” which leverages the concept of “minimum description length” encoding method to maximize the dissimilarity between different groups of events. A large dissimilarity measure implies different behaviors between the two groups of events, while a small dissimilarity measure implies the opposite meaning. Finally, the “manifold learning analytical method” is applied to the extracting of the meaningful features from a large quantity of features via visualized interfaces which combined either with regular classification methods or with active learning methods to identify suspicious or malicious behavior sequences.

    The proposed SBAD has made the following contributions: (1) A proposal for a behavior thread extraction mechanism without any parameter configuration and a priori knowledge; (2) A novel dissimilarity measure for event sequences. Under this measure, not just sequence to sequence, but also sequence to template, or template to template can be compared to each other using a dissimilarity measure; (3) Justified by real case study, an event analyzer which can effectively deal with event sequences in presence of noise, interleaving, interjection (long insertion) or interception contained sequences, sequences with missing alerts, etc; (5) A visualization mechanism (Isomap) which facilitates the visual understanding of the distribution between malicious or normal behavior of event sequences in low dimension space.

    We evaluated our proposed methods using different sources of dataset (e.g., public data, semi-public data and real world private data). The evaluation results show that the proposed SBAD method could identify various malicious sequence patterns under abundant noisy environment. Moreover, through the graphical structures and the representation from the “manifold learning method” the visualized result could be provided naturally for the further study or analysis by either security experts or operation personnel.

    1 Introduction 1 1.1 Problem Statement and Our Approach . . . . . . . . . . . . . . . . . 5 1.2 Thesis Contributions . . . . . . . . . . . . . . . . . . . . . . . . . . 11 1.3 Outlines of the Dissertation . . . . . . . . . . . . . . . . . . . . . . . 12 2 Background and Related Work 13 2.1 Network Intrusion Detection System . . . . . . . . . . . . . . . . . . 14 2.1.1 Detection Philosophy . . . . . . . . . . . . . . . . . . . . . . 14 2.1.2 Intrusion Detection in Network Payload . . . . . . . . . . . . 16 2.1.3 Intrusion Detection in Web Applications . . . . . . . . . . . . 17 2.2 Event Analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18 iiiCONTENTS iv 2.2.1 Graphical Alert Representation . . . . . . . . . . . . . . . . . 19 2.2.2 Alert Correlation . . . . . . . . . . . . . . . . . . . . . . . . 21 2.2.3 False Alarm Reduction . . . . . . . . . . . . . . . . . . . . . 23 2.3 Sequence Mining . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24 2.3.1 Sequence Alignment and Event Mining . . . . . . . . . . . . 24 2.3.2 Dissimilarity Measure . . . . . . . . . . . . . . . . . . . . . 26 3 Sequential-based Anomaly Detection 28 3.1 Correlation Graph Constructor . . . . . . . . . . . . . . . . . . . . . 31 3.1.1 Behavior Thread Extraction . . . . . . . . . . . . . . . . . . 32 3.1.2 Behavior Transition Parameter Estimation . . . . . . . . . . . 33 3.2 Manifold Learning-based Causal Relations Builder . . . . . . . . . . 35 3.2.1 Behavior Dissimilarity Measure . . . . . . . . . . . . . . . . 36 3.2.2 Causal Relations Behavior Representation . . . . . . . . . . . 39 3.2.3 Incremental SBAD . . . . . . . . . . . . . . . . . . . . . . . 41 3.3 Malicious Behavior Identification and Inspectation . . . . . . . . . . 42 3.3.1 Malicious Behavior Detection . . . . . . . . . . . . . . . . . 42CONTENTS v 4 Experiment Results 45 4.1 Data Description . . . . . . . . . . . . . . . . . . . . . . . . . . . . 45 4.1.1 iCAST/Acer 2007 . . . . . . . . . . . . . . . . . . . . . . . 45 True Attacks and False Alarms . . . . . . . . . . . . . . . . . 47 Attack Scenario . . . . . . . . . . . . . . . . . . . . . . . . . 49 4.1.2 DARPA 99 . . . . . . . . . . . . . . . . . . . . . . . . . . . 50 4.1.3 PKDD 2007 . . . . . . . . . . . . . . . . . . . . . . . . . . . 51 4.1.4 CHT2009 . . . . . . . . . . . . . . . . . . . . . . . . . . . . 51 4.2 Obfuscated JavaScript Dataset . . . . . . . . . . . . . . . . . . . . . 52 4.3 Model Parameters and Evaluation Metrics . . . . . . . . . . . . . . . 54 4.3.1 Model Parameters . . . . . . . . . . . . . . . . . . . . . . . 54 4.3.2 Evaluation Metrics . . . . . . . . . . . . . . . . . . . . . . . 55 4.3.3 Sensitivity Analysis and Robustness . . . . . . . . . . . . . . 56 4.4 Effectiveness Analysis . . . . . . . . . . . . . . . . . . . . . . . . . 60 4.4.1 Identifying Intrusive Patterns . . . . . . . . . . . . . . . . . . 61 4.4.2 Dealing with non-typical patterns . . . . . . . . . . . . . . . 64 4.5 Application to Alert Analysis . . . . . . . . . . . . . . . . . . . . . . 65CONTENTS vi 4.5.1 Identifying Potential Attack Sequences . . . . . . . . . . . . 66 4.5.2 Malicious Knowledge Base and Indexing . . . . . . . . . . . 67 4.6 Visualization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 68 4.7 Performance in Real Environment . . . . . . . . . . . . . . . . . . . 68 4.7.1 Efficiency Analysis . . . . . . . . . . . . . . . . . . . . . . . 69 4.7.2 Real Case Analysis . . . . . . . . . . . . . . . . . . . . . . . 70 5 Conclusions and Further Work 78 5.1 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 78 5.2 Dissertation Contributions . . . . . . . . . . . . . . . . . . . . . . . 79 5.3 Future Work . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 80 5.4 Closed Remark . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 82

    [1] “DARPA intrusion detection evaluation,” 1999, http://www.ll.mit.edu/mission/communications/ist/corpora/ideval/data/index.html.
    [2] R. Agrawal, T. Imielinski, and A. Swami, “Database mining: a performance perspective,” IEEE Trans. on Knowledge and Data Engineering, vol. 5, no. 6, pp. 914–925, 1993.

    [3] V. Chandola, V. Mithal, and V. Kumar, “Comparative evaluation of anomaly detection techniques for sequence data,” in Proceeding of Eighth IEEE Interna-
    tional Conference on Data Mining (ICDM ’08), 2008.

    [4] C.-C. C. W.-Y. L. Chien-Yi Chiu, Yuh-Jye Lee and H.-C. Huang, “Semi-supervised learning for false alarm reduction,” in ADVANCES IN DATA MINING. APPLICATIONS AND THEORETICAL ASPECTS, Lecture Notes in Computer
    Science, 2010, Volume 6171/2010, 595-605, 2010.

    [5] E. M. Clarke, O. Grumberg, and D. A. Peled, Model Checking. MIT Press,
    2000.

    [6] M. Colajanni, D. Gozzi, and M. Marchetti, “Collaborative architecture for malware detection and analysis,” in Proceedings of The Ifip Tc 11 23rd International Information Security Conference, vol. 278/2008. Springer Boston, 2008.

    [7] T. M. Cover and J. A. Thomas, Elements of Information Theory (2nd Ed.). Wiley-Interscience, July 2006.

    [8] T. F. Cox and M. A. A. Cox, Multidimensional Scaling, Second Edition. Chapman & Hall/CRC, 2000.

    [9] F. Cuppens and R. Ortalo, “Lambda: A language to model a database for detection of attacks,” in Proceedings of the Third International Workshop on Recent
    Advances in Intrusion Detection, January 2000, pp. 197–216.

    [10] M. K. R. D. Gao and D. X. Song, “Behavioral distance measurement using hidden markov models,” in Proceedings of the 9th International Symposium on Recent Advances in Intrusion Detection (RAID 2006), 2006.

    [11] O. Dain and R. Cunningham, “Fusing heterogeneous alert streams into scenarios,” in Proceedings of the 2001 ACM workshop on Data Mining for Security
    Applications, December 2001, pp. 1–12.

    [12] W. A. Debar, H., “Aggregation and correlation of intrusion detection alerts,” in Proceedings of the International Symposium on Recent Advances in Intrusion Detection, 2001.

    [13] R. Durbin, S. Eddy, A. Krogh, and G. Mitchison, Biological sequence analysis, probabilistic models of proteins and nucleic acids. Cambridge University Press, 1998.

    [14] M. Exbrayat, “ECML/PKDD challenge: analyzing web traffic a boundaries signature approach,” pp. 17–29, 2007.

    [15] G. Gu, P. Porras, V. yegneswaran, M. Fong, and W. Lee, “Bothunter: Detecting malware infection through ids-driven dialog correlation,” in Proceeding of 16th USENIX Security Symposium, August 2007, pp. 157–182.

    [16] M. O. H. S. Seung and H. Sompolinsky, “Query by committee,” in Proceedings of the fifth annual workshop on Computational learning theory. Pittsburgh,Pennsylvania, United States: ACM, 1992, pp. 287–294.

    [17] J. He and J. Carbonell, “Nearest-neighbor-based active learning for rare category detection,” in Proceeding of 21th Neural Information Processing System
    (NIPS07), 2007.

    [18] T. Holz, C. Gorecki, K. Rieck, and F. Freiling, “Measuring and detecting fastflux service networks,” in Proceedings of the 15th Network and Distributed System
    Security Symposium (NDSS), 2008.

    [19] S.-Y. Huang, C.-H. Mao, and H.-M. Lee, “Fast-flux service network detection based on spatial snapshot mechanism for delay-free detection,” in Proceeding of 5th ACM Symposyium on Information, Computer and Communications Security(ASIACCS 2010), 2010.

    [20] K. L. Ingham and H. Inoue, “Comparing anomaly detection techniques for http,”in In Proceedings of the 10th International Symposium on Recent Advances in Intrusion Detection (RAID 2007), 2007.

    [21] C. J. C. Burges, “A tutorial on support vector machines for pattern recognition,”Data Mining and Knowledge Discovery, vol. 2, no. 2, pp. 121–167, 1998.

    [22] K. Julisch, “Clustering intrusion detection alarms to support root cause analysis,” ACM Trans. Inf. Sys. Secur., vol. 6, no. 4, pp. 443–471, November 2003.

    [23] J. J. P. K. Wang and S. J. Stolfo., “Anagram: a content anomaly detector resistant to mimicry attack,” in Proceedings of the 9th International Symposium on Recent Advances in Intrusion Detection (RAID 2006), 2006.

    [24] J. B. Kenneth L. Ingham, A. Somayaji and S. Forrest, “Learning dfa representations of http for protecting web applications,” The International Journal of Computer and Telecommunications Networking, vol. 51, pp. 1239–1255, 2007.

    [25] E. Keogh, S. Lonardi, and C. A. Ratanamahatana, “Towards parameter-free data mining,” in KDD ’04: Proceedings of the tenth ACM SIGKDD international conference on Knowledge discovery and data mining. New York, NY, USA: ACM,
    2004, pp. 206–215.

    [26] R. Kohavi and F. Provost, “Glossary of terms,” Editorial for the Special Issue on Applications of Machine Learning and the Knowledge Discovery Process, vol. 30, pp. 271–274, 1998.

    [27] C. Kruegel and G. Vigna, “Anomaly detection of web-based attacks,” in CCS’03: Proceedings of the 10th ACM conference on Computer and communications
    security. New York, NY, USA: ACM, 2003, pp. 251–261.

    [28] C. Kruegel, G. Vigna, and W. Robertson, “A multi-model approach to the detection of web-based attacks,” Comput. Netw., vol. 48, no. 5, pp. 717–738, 2005.

    [29] J. Lafferty, A. Mccallum, and F. Pereira, “Conditional random fields: Probabilistic models for segmenting and labeling sequence data,” in Proc.18th International Conf. on Machine Learning. Morgan Kaufmann, San Francisco, CA, 2001, pp. 282–289. [Online]. Available: http://citeseer.ist.psu.edu/lafferty01conditional.html

    [30] M. H. C. Law, N. Z. 0002, and A. K. Jain, “Nonlinear manifold learning for data stream,” in SDM, 2004.

    [31] W. Lee and S. J. Stolfo, “Data mining approaches for intrusion detection,” in SSYM’98: Proceedings of the 7th conference on USENIX Security Symposium. Berkeley, CA, USA: USENIX Association, 1998, pp. 6–6.

    [32] W. Lee and D. Xiang, “Information-theoretic measures for anomaly detection,” vol. 0. Los Alamitos, CA, USA: IEEE Computer Society, 2001, p. 0130.

    [33] Y.-J. Lee and O. L. Mangasarian, “SSVM: A smooth support vector machine for classification,” Comput. Optim. Appl., vol. 20, no. 1, pp. 5–22, 2001.

    [34] D. D. Lewis and W. A. Gale, “A sequential algorithm for training text classifiers,”in SIGIR ’94: Proceedings of the 17th annual international ACM SIGIR conference on Research and development in information retrieval. New York, NY, USA: Springer-Verlag New York, Inc., 1994, pp. 3–12.

    [35] M. Li, J. H. Badger, X. Chen, S. Kwong, P. Kearney, and H. Zhang, “An information-based sequence distance and its application to whole mitochondrial
    genome phylogeny,” Bioinformatics, vol. 17, no. 2, pp. 149–154, 2001.

    [36] M. Li and P. Vit´anyi, An Introduction to Kolmogorov Complexity and Its Applications (2nd Ed.). New York: Springer, 1997.

    [37] J. E. LIKARISH, P. and I. JO, “Obfuscated malicious javascript detection using classification techniques,” in In Malware 2009, Montreal, 2009.

    [38] J. Lin, E. Keogh, S. Lonardi, and B. Chiu, “A symbolic representation of time series, with implications for streaming algorithms,” in In Proceedings of the 8th ACM SIGMOD Workshop on Research Issues in Data Mining and Knowledge
    Discovery. ACM Press, 2003, pp. 2–11.

    [39] S. Manganaris, M. Christensen, D. Zerkle, and K. Hermiz, “A data mining analysis of rtid alarms,” Comput. Networks, vol. 34, no. 4, pp. 571–577, October 2000.

    [40] H. Mannila, H. Toivonen, and A. I. Verkamo, “Discovering Frequent Episodes in Sequences,” in Proceedings of the First International Conference on Knowledge Discovery and Data Mining (KDD-95), U. M. Fayyad and R. Uthurusamy, Eds. Montreal, Canada: AAAI Press, 1995. [Online].Available: citeseer.ist.psu.edu/mannila95discovering.html

    [41] C.-H. Mao, H.-M. Lee, D. Parikh, T. Chen, and S.-Y. Huang, “Semi-supervised co-training and active learning based approach for multi-view intrusion detection,” in SAC ’09: Proceedings of the 2009 ACM symposium on Applied Computing. New York, NY, USA: ACM, 2009, pp. 2042–2048.

    [42] A. K. McCallum and K. Nigam, “Employing em and pool-based active learning for text classication,” in ICML98, 1998.

    [43] J. McHugh, “The 1998 lincoln laboratory ids evaluation,” in Proceedings of the Third International Workshop on Recent Advances in Intrusion Detection, October 2000, pp. 145–161.

    [44] P. Ning, Y. Cui, D. Reeves, and D. Xu., “Techniques and tools for analyzing intrusion alerts,” ACM Trans. Inf. Sys. Secur., vol. 7, no. 2, pp. 274–318, May 2004.

    [45] P. Ning and D. Xu, “Hypothesizing and reasoning about attacks missed by intrusion detection systems,” ACM Trans. Inf. Sys. Secur., vol. 7, no. 4, pp. 591–627, November 2004.

    [46] P. Ning, S. Jajodia, and X. S. Wang, “Abstraction-based intrusion detection in distributed environments,” ACM Trans. Inf. Syst. Secur., vol. 4, no. 4, pp. 407-452, 2001.

    [47] H.-K. Pao and J. Case, “Computing entropy for ortholog detection,” in International Conference on Computational Intelligence, 2004, pp. 89–92.

    [48] T. Pietraszek, “Using adaptive alert classification to reduce false positives in intrusion detection,” in Proceedings of 7th International Symposium, RAID 2004,
    Sophia Antipolis, France, September 15 - 17, 2004., 2004.

    [49] R. Pless, “Image spaces and video trajectories: Using isomap to explore video sequences,” in Ninth IEEE International Conference on Computer Vision, October 2003, pp. 1433–1440.

    [50] O. Sheyner, J. Haines, S. Jha, R. Lippmann, and J. M. Wing, “Automated generation and analysis of attack graphs,” in Proceedings of the 2002 IEEE Symposium on Security and Privacy, May 2002, pp. 273–284.

    [51] J. B. Tenenbaum, V. de Silva, and J. C. Langford, “A global geometric framework for nonlinear dimensionality reduction.” Science, vol. 290, no. 5500, pp. 2319–2323, December 2000.

    [52] A. Valdes and K. Skinner, “Probabilistic alert correlation,” in Lecture Notes in Computer Science, 2001, pp. 54–68.

    [53] F. Valeur, G. Vigna, C. Kruegel, and R. Kemmerer, “A comprehensive approach to intrusion detection alert correlation.” IEEE Transactions on Dependable and Secure Computing, vol. 1, no. 3, pp. 146–169, July 2004.

    [54] V. N. Vapnik, The Nature of Statistical Learning Theory. New York: Springer, 1999.

    [55] K. Wang and S. J. Stolfo, “Anomalous payload-based network intrusion detection,” Lecture Notes in Computer Science (Recent Advances in Intrusion Detection), vol. 3224/2004, pp. 203–222, 2005.

    [56] W. Wang and T. E. Daniels, “A graph based approach toward network forensics analysis,” ACM Trans. Inf. Syst. Secur., vol. 12, no. 1, pp. 1–33, 2008.

    [57] J. Wing, Information Assurance: Survivability and Security in Networked Systems. Morgan Kaufmann Publishers, Elsevier, Inc., 2008, ch. 9: Scenario Graphs Applied to Network Security, pp. 247–277.

    [58] S. S. Y. Song and A. Keromytis, “Spectrogram: A mixture-of-markov-chains model for anomaly detection in web trac,” in The proceeding of 16th Annual Network and Distributed System (NDSS), 2009.

    [59] S. Zhang, J. Li, X. Chen, and L. Fan, “Building network attack graph for alert causal correlation,” Computers & Security, vol. 27, no. 5-6, pp. 188–196, 2008.

    [60] J. Zhou, M. Heckman, B. Reynolds, A. Carlson, and M. Bishop, “Modeling network intrusion detection alerts for correlation,” ACM Trans. Inf. Sys. Secur., vol. 10, no. 1, pp. 1–31, February 2007.

    [61] B. Zhu and A. A. Ghorbani, “Alert correlation for extracting attack strategies,”International Journal of Network Security, vol. 3, no. 3, pp. 244–258, November 2006.

    QR CODE