Author: 高黛威 (Tai-wei Kao)
Thesis title: Hiding dynamic sensitive association rules in incremental data (在漸進式資料中隱藏動態敏感關聯法則)
Advisor: 戴碧如 (Bi-ru Dai)
Oral defense committee: 鮑興國 (Hsing-kuo Pao), 蔡曉萍 (Hsiao-ping Tsai), 戴志華 (Chih-hua Tai)
Degree: Master
Department: College of Electrical Engineering and Computer Science, Department of Computer Science and Information Engineering
Year of publication: 2013
Academic year of graduation: 101 (ROC calendar)
Language: English
Number of pages: 53
Chinese keywords: 關聯法則, 敏感資訊, 動態敏感關聯法則, 隱藏敏感關聯法則, 隱私保護, 漸進式環境
English keywords: association rule, sensitive information, dynamic sensitive association rule, hide sensitive association rule, protect privacy, incremental environment
  • With advances in technology and intensifying business competition, privacy issues are receiving more attention than before. Mining association rules is an important data mining technique, but the mining process can also expose sensitive information and raise privacy problems. Many studies have therefore proposed hiding sensitive association rules. However, the rapid development of computer and Internet technologies means that data keeps growing, and the set of sensitive association rules may itself change over time and with policy. Both factors make protecting sensitive association rules challenging. To the best of our knowledge, existing techniques for hiding sensitive association rules cannot yet handle dynamic data and dynamic sensitive rules effectively.

    To address these problems, this thesis proposes a framework for protecting dynamic sensitive association rules in an incremental environment, consisting of two algorithms, HSAi and HDSA. HSAi protects sensitive association rules in incremental data: to hide the sensitive rules, it uses a strategy that selects appropriate victim items and victim transactions for deletion. HDSA protects dynamic sensitive association rules, where the set of sensitive rules can change through additions and deletions. Deleting a sensitive rule means that a previously hidden association rule should, as far as possible, reappear in the mining results. Besides protecting sensitive rules, both algorithms aim to keep the side effects on the released dataset as small as possible, so that non-sensitive association rules are affected as little as possible. Experimental results show that, for both incremental data and dynamic sensitive rules, the framework causes few side effects and releases a sanitized database of desirable quality.
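The general idea of hiding a rule by deleting a victim item from victim transactions can be illustrated with a minimal sketch. This is not the HSAi algorithm from the thesis: the function names (`support`, `confidence`, `is_visible`, `hide_rule`), the choice of victim item, and the ordering of victim transactions are all assumptions made for illustration. A rule is treated as hidden once its support or confidence falls below the mining thresholds.

```python
from typing import FrozenSet, List, Set, Tuple

# A rule is (antecedent, consequent); both names are illustrative.
Rule = Tuple[FrozenSet[str], FrozenSet[str]]


def support(db: List[Set[str]], itemset: FrozenSet[str]) -> float:
    """Fraction of transactions that contain every item in itemset."""
    if not db:
        return 0.0
    return sum(1 for t in db if itemset <= t) / len(db)


def confidence(db: List[Set[str]], rule: Rule) -> float:
    """Among transactions containing the antecedent, the fraction that
    also contain the consequent."""
    lhs, rhs = rule
    lhs_count = sum(1 for t in db if lhs <= t)
    if lhs_count == 0:
        return 0.0
    return sum(1 for t in db if (lhs | rhs) <= t) / lhs_count


def is_visible(db: List[Set[str]], rule: Rule,
               min_sup: float, min_conf: float) -> bool:
    """A rule can be mined if it meets both thresholds."""
    lhs, rhs = rule
    return (support(db, lhs | rhs) >= min_sup
            and confidence(db, rule) >= min_conf)


def hide_rule(db: List[Set[str]], rule: Rule,
              min_sup: float, min_conf: float) -> List[Set[str]]:
    """Return a sanitized copy of db in which the rule is no longer visible."""
    lhs, rhs = rule
    sanitized = [set(t) for t in db]  # leave the input untouched
    # Assumption: pick one consequent item as the victim item.
    victim_item = sorted(rhs)[0]
    # Assumption: modify shorter supporting transactions first, on the
    # guess that they support fewer non-sensitive rules.
    candidates = sorted(
        (i for i, t in enumerate(sanitized) if (lhs | rhs) <= t),
        key=lambda i: len(sanitized[i]),
    )
    for i in candidates:
        if not is_visible(sanitized, rule, min_sup, min_conf):
            break
        sanitized[i].discard(victim_item)
    return sanitized


if __name__ == "__main__":
    db = [{"a", "b", "c"}, {"a", "b"}, {"a", "b", "d"}, {"a", "c"}, {"b", "d"}]
    rule = (frozenset({"a"}), frozenset({"b"}))
    released = hide_rule(db, rule, min_sup=0.4, min_conf=0.6)
    print(is_visible(db, rule, 0.4, 0.6), is_visible(released, rule, 0.4, 0.6))
```

Deleting a consequent item lowers both the support and the confidence of the rule, so the loop stops as soon as either threshold is missed. A real sanitization strategy would also weigh the effect of each deletion on non-sensitive rules, which this sketch ignores.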

    Advisor's Recommendation Letter
    Thesis Oral Defense Committee Approval
    Chinese Abstract
    ABSTRACT
    Acknowledgements
    LIST OF CONTENTS
    LIST OF TABLES
    LIST OF EXAMPLES
    LIST OF FIGURES
    1. INTRODUCTION
    1.1 BACKGROUND
    1.2 MOTIVATION
    1.3 CONTRIBUTION
    1.4 THE ORGANIZATION OF THIS THESIS
    2. RELATED WORK
    3. PROBLEM FORMULATION AND BASIC DEFINITION
    3.1 PROBLEM FORMULATION
    3.2 PRELIMINARIES
    4. HIDING SENSITIVE ASSOCIATION RULES IN INCREMENTAL DATA (HSAI)
    4.1 SANITIZATION
    4.1.1 Compute the deleting thresholds
    4.1.2 Select the handled sensitive rule and victim item
    4.1.3 Filter out the transactions
    4.1.4 Select the victim transactions
    4.2 INCREMENTAL DATA
    4.2.1 Inverted Structure
    5. HIDING DYNAMIC SENSITIVE ASSOCIATION RULES (HDSA)
    5.1 STORING SENSITIVE RULES STRUCTURE
    5.2 ADDING SENSITIVE ASSOCIATION RULES
    5.3 DELETING SENSITIVE ASSOCIATION RULES
    5.3.1 Computing the adding thresholds
    5.3.2 Selecting recovered rules and items
    5.3.3 Recovery
    5.3.4 Re-hide
    6. EXPERIMENTS
    6.1 METRICS
    6.2 DATASET
    6.3 INCREMENTAL DATA
    6.4 DYNAMIC SENSITIVE ASSOCIATION RULES
    6.4.1 Adding sensitive association rules
    6.4.2 Deleting sensitive association rules
    6.4.3 Combination
    6.5 EFFICIENCY
    6.6.1 Incremental data
    6.6.2 Dynamic sensitive rule
    7. CONCLUSION
    CONFERENCE
    Authorization Form

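    Full-text release date: 2018/07/26 (campus network)
    Full-text release date: full text not authorized for public release (off-campus network)
    Full-text release date: full text not authorized for public release (National Central Library: National Digital Library of Theses and Dissertations in Taiwan)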