簡易檢索 / 詳目顯示

研究生: Vania Utami
Vania - Utami
論文名稱: A Frequent Itemset Mining Algorithm based on Interval Intersection Operation
A Frequent Itemset Mining Algorithm based on Interval Intersection Operation
指導教授: 呂永和
Yung-Ho Leu
口試委員: 楊維寧
Wei-Ning Yang
陳雲岫
Chen, Yun-Shiow
呂永和
Yung-Ho Leu
學位類別: 碩士
Master
系所名稱: 管理學院 - 資訊管理系
Department of Information Management
論文出版年: 2015
畢業學年度: 103
語文別: 英文
論文頁數: 24
外文關鍵詞: Frequest Itemset Mining, Bitmap Operations, Interval Operations
相關次數: 點閱:198下載:28
分享至:
查詢本校圖書館目錄 查詢臺灣博碩士論文知識加值系統 勘誤回報
  • Frequent Itemset Mining (FIM) is a crucial technique for data mining. This thesis proposes a new FIM algorithm based on interval and bitmap operations. For each itemset in a dataset, an interval set is used to represent the transactions that contain this itemset. Interval intersection operations are then used to find the support counts of the itemset. The experimental results showed that this algorithm takes less execution times than the bit table and Apriori TID algorithms for several configurations with different minimum support thresholds, numbers of transactions, and average lengths of transactions.


    Frequent Itemset Mining (FIM) is a crucial technique for data mining. This thesis proposes a new FIM algorithm based on interval and bitmap operations. For each itemset in a dataset, an interval set is used to represent the transactions that contain this itemset. Interval intersection operations are then used to find the support counts of the itemset. The experimental results showed that this algorithm takes less execution times than the bit table and Apriori TID algorithms for several configurations with different minimum support thresholds, numbers of transactions, and average lengths of transactions.

    ABSTRACT i ACKNOWLEDGEMENT ii TABLE OF CONTENTS iii LIST OF FIGURES iv LIST OF TABLES v Chapter 1. Introduction 6 Chapter 2. Problem Definition 8 2.1. Frequent Itemset Mining 8 2.2. An Interval 8 Chapter 3. Frequent Itemset Mining based on the Interval Intersection Operation 10 3.1. The Apriori Algorithm 10 3.2. Candidate Generation 11 3.3. Support Counting 15 3.4. Interval intersection Operation 16 Chapter 4. Experimental Result 21 4.1. Experimental Design 21 4.2. Performance Comparisons 22 4.2.1 Performance comparisons for datasets with different support counts 22 4.2.2 Performance comparisons with different number of transactions 26 4.2.3 Performance comparisons on datasets with different transaction sizes 26 Chapter 5. Conclusions and Future Research 28 5.1. Result Summary 28 5.2. Limitation 28 5.3. Future Research 28 REFERENCES xxx

    [1] M. J. Zaki, "Scalable algorithms for association mining," Knowledge and Data Engineering, IEEE Transactions , pp. 372-390, 2000.
    [2] S. Agrawal, V. Krishnan and Haritsa, "On addressing efficiency concerns in privacy-preserving mining," Database Systems for Advanced Applications Lecture Notes in Computer Science , vol. 2973, pp. 113-124, 2004.
    [3] S.-Y. Wur and Y. Leu, "The Boolean-based Algorithms for Mining Association Rules in Large Databases," The 6th International Conference on Database Systems for Advanced Applications, pp. 179-186, 1999...
    [4] R. Agrawal, T. Imieliński and A. Swami, "Mining association rules between sets of items in large databases," ACM SIGMOD Record, no. 2, p. 22, 1993.
    [5] Moore, R. E, R. Baker Kearfott and M. J. Cloud, Introduction to interval analysis, Siam, 2009.
    [6] Cormen and T. H, Introduction to algorithms., Cambridge: MIT press, 2001.
    [7] Layer and R. M., "Binary Interval Search: a scalable algorithm for counting interval intersections," Bioinformatics, vol. 29, no. 1, pp. 1-7, 2013.
    [8] Dong, Jie and M. Han, "BitTableFI: An efficient mining frequent itemsets algorithm," Knowledge-Based Systems , vol. 20, no. 4, pp. 329-335, 2007.
    [9] "IBM Data Mining Web Site," [Online]. Available: http://www.almaden.ibm.cs/quest/index.html.
    [10] C. Borgelt, "Christian Borgelt's Web Pages," [Online]. Available: http://www.borgelt.net/fpgrowth.html.
    [11] Woon, Yew-Kwong and Wee-Keong, "A support-ordered trie for fast frequent itemset discovery," knowledge and Data Engineering, IEEE Transactions, vol. 16, no. 7, pp. 875-879, 2004.

    QR CODE