Graduate student: | 黃婉綺 Wan-Qi Huang |
---|---|
Thesis title: | 跨組織聯合資料分享之資料隱私保護方法:系統文獻方法 Data Privacy Protection Schemes for Cross Organizational Data sharing: A Systematic Literature Review |
Advisor: | 查士朝 Shi-Cho Cha |
Oral defense committee: | 羅乃維 Nai-Wei Lo, 黃政嘉 Jheng-Jia Huang |
Degree: | Master |
Department: | College of Management - Department of Information Management |
Year of publication: | 2021 |
Academic year of graduation: | 109 |
Language: | Chinese |
Number of pages: | 61 |
Keywords (Chinese): | 資料分享, 隱私保護, 差分隱私, 聯邦學習, 秘密共享 |
Keywords (English): | data sharing, privacy, differential privacy, federated learning, secret sharing |
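With the development of big data and artificial intelligence technologies, enterprises and government agencies have collected large amounts of data. If these data could be shared, individuals and organizations with fewer economic resources would find data collection less difficult, and an organization could understand a user better by obtaining data about that user from many sources. However, making data open and transparent may expose people's personal privacy to the public, or allow confidential information to be inferred by combining shared data sets. To keep shared data from being exploited by malicious parties, researchers have continually refined methods that allow data to be shared while protecting its privacy. As demand for data exchange across organizations and in public-private partnerships grows, so do the demand for data privacy and its importance.
This study reviews data privacy protection methods for cross-organizational federated data sharing published from 2015 to 2020 and groups them into several categories, chiefly anonymization, differential privacy, encryption, federated learning, and secret sharing. It then analyzes these methods, focusing on the risks that may remain in each one, the degree of privacy protection it provides, the forms in which it is most often used, and whether it preserves the original form of the data. Through this analysis, the study identifies the privacy methods most applicable at present, the room for improvement in each method, and future development trends.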
Currently, organizations usually collect user data to understand their users via big data or AI technologies. If an organization can obtain a user's data from other organizations, it can understand that user better. Therefore, more and more organizations are urged to share their data or to join an alliance and share data with its members. However, data transparency raises the fear that personal details will be exposed publicly or that private files will be compromised. To ensure that shared information is not exploited by the wrong people, experts have been working on ways to improve and protect information privacy; as demand for information exchange increases, so does the importance of privacy.
This research gathers and analyzes work on cross-organization data sharing from 2015 to 2020 and identifies multiple privacy protection methods, chiefly anonymization, differential privacy, encryption, federated learning, and secret sharing. The analysis focuses on the risks that may remain in these methods, assesses the degree of privacy protection each one provides, and examines in what settings each is used and whether it preserves the original form of the data. Through this research we can identify the most suitable privacy protection methods at present, along with their room for improvement and likely future trends.