
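Graduate student: 謝斐如
Fei-Ju Hsieh
Thesis title: 在加密雲數據下基於語意之多關鍵字查詢
Semantics-based Multi-Keyword Search over Encrypted Cloud Data
Advisor: 金台齡
Tai-Lin Chin
Oral defense committee: 彭文志
Wen-Chih Peng
王丕中
Pi-Chung Wang
沈上翔
Shan-Hsiang Shen
Degree: Master
Department: College of Electrical Engineering and Computer Science - Department of Computer Science and Information Engineering
Year of publication: 2017
Academic year of graduation: 105 (ROC academic year)
Language: English
Number of pages: 50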
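Keywords (Chinese): 雲端運算 (cloud computing), 雲端安全 (cloud security), 可搜尋加密機制 (searchable encryption scheme), 語意擴充查詢 (semantic expansion search)
Keywords (English): Cloud computing, Cloud security, Searchable encryption, Semantics-based search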
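  • Cloud computing and cloud storage have become increasingly widespread in recent years, and the volume of data stored in the cloud has grown accordingly. How to perform effective keyword search in an encrypted environment while preserving data privacy has therefore become an important issue. Most methods in the literature provide only exact search over single or multiple keywords, in which the keywords submitted by the user must exactly match keywords in a pre-defined dictionary before a search can be carried out. In real-world applications, however, restricting users to the keywords in a pre-defined dictionary is highly impractical. Existing fuzzy search methods focus only on analyzing keyword structure to detect spelling mistakes and do not address how to give users more flexibility in choosing keywords. This thesis proposes a semantic multi-keyword search scheme for the cloud environment in which the user's keywords are no longer limited to the pre-defined dictionary; users can instead freely submit the keywords they need and obtain the documents most relevant to those keywords. In addition, data privacy is taken into account while the cloud server performs the search computation. Experimental results on real-world data show that the proposed method can efficiently achieve semantically expanded multi-keyword search in the cloud environment.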


    Cloud storage has gained popularity in recent years. With the increasing quantity of data outsourced to cloud storage, keyword search over encrypted cloud data with privacy preservation has become an important topic. Most techniques in the literature provide only exact single- or multi-keyword search, in which the keywords have to exactly match those in a pre-defined dictionary. However, restricting users' keywords to the pre-defined dictionary is impractical for real-world applications. Some existing fuzzy keyword search schemes focus only on handling spelling mistakes in keywords; the flexibility of the keywords used in the search is not considered.
    This thesis addresses the problem of semantic multi-keyword search over encrypted cloud data. Users are not limited to the keywords in the pre-defined dictionary of the dataset and can choose keywords freely. The similarity between the given keywords and the search index of each document is calculated, and an adequate set of documents is selected as the search result based on that similarity. In addition, the privacy of the search is considered while the search is executed by the third-party service provider. Experiments are conducted on a large real-world dataset of papers. The experimental analyses show that the proposed scheme can perform semantic multi-keyword search effectively and efficiently.
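
    The ranking step described above (score each document index against the query keywords, then return the best matches) can be illustrated with a minimal plaintext sketch. This is not the thesis's SMSE or E-SMSE scheme: the record does not give the scoring formula or the encryption, so both are assumptions or omitted here. The table of contents names Word2Vec as the word-similarity tool; the vectors in `TOY_VECTORS`, the function names, and the "average of best cosine match" score are all hypothetical and chosen only for illustration.

```python
import math

# Hypothetical toy embeddings (assumption); a real system would load
# trained Word2Vec vectors instead of these hand-written values.
TOY_VECTORS = {
    "cloud":      [0.9, 0.1, 0.0],
    "server":     [0.8, 0.2, 0.1],
    "encryption": [0.1, 0.9, 0.1],
    "cipher":     [0.2, 0.8, 0.0],
    "protein":    [0.0, 0.1, 0.9],
    "gene":       [0.1, 0.0, 0.8],
}

def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def document_score(query_keywords, index_keywords, vectors=TOY_VECTORS):
    """Assumed score: for each query keyword, take its best cosine match
    among the document's index keywords, then average over the query."""
    scores = []
    for q in query_keywords:
        if q not in vectors:
            continue  # words without a vector are skipped in this sketch
        best = max(
            (cosine(vectors[q], vectors[k]) for k in index_keywords if k in vectors),
            default=0.0,
        )
        scores.append(best)
    return sum(scores) / len(scores) if scores else 0.0

def top_k(query_keywords, indexes, k=2):
    """Return the ids of the k documents with the highest score."""
    ranked = sorted(
        indexes,
        key=lambda doc_id: document_score(query_keywords, indexes[doc_id]),
        reverse=True,
    )
    return ranked[:k]

# Hypothetical per-document keyword indexes (plaintext, no encryption).
INDEXES = {
    "doc_a": ["cloud", "encryption"],
    "doc_b": ["protein", "gene"],
    "doc_c": ["server"],
}

if __name__ == "__main__":
    # "cipher" is not an index keyword of any document, yet doc_a still
    # ranks first because "cipher" is close to "encryption" in vector space.
    print(top_k(["cipher", "server"], INDEXES, k=2))
```

    With these toy vectors the query `["cipher", "server"]` ranks `doc_a` ahead of `doc_c`, even though neither document is indexed under "cipher". That is the kind of out-of-dictionary flexibility the abstract describes; performing the same comparison over encrypted indexes is the part this sketch leaves out.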

    Abstract in Chinese
    Abstract in English
    Contents
    List of Figures
    List of Tables
    List of Algorithms
    1 Introduction
    2 Related Work
      2.1 Searchable Encryption
      2.2 Single Keyword Searchable Encryption
      2.3 Multi Keyword Searchable Encryption
      2.4 Fuzzy Keyword Searchable Encryption
      2.5 Some related application of Word2Vec
    3 Problem Formulation and Proposed Method
      3.1 Problem Formulation
        3.1.1 System model
        3.1.2 Threat Model
      3.2 Semantics-based Multi-Keyword Search over Encrypted Cloud Data (SMSE)
        3.2.1 Document Index Generation
        3.2.2 Semantic Search Mechanism
        3.2.3 Evaluate word similarity - Word2Vec
      3.3 Enhanced Semantics-based Multi-Keyword Search over Encrypted Cloud Data (E-SMSE)
        3.3.1 Document Index Generation
        3.3.2 Semantic Search Mechanism
    4 Simulations
      4.1 Dataset and Data Preprocessing
      4.2 Similarity analysis in SMSE
      4.3 Precision analysis in E-SMSE
      4.4 Efficiency
    5 Conclusions
    References
    Appendix

