
Graduate student: 林芳如 (Fang-Ju Lin)
Thesis title: A Risk Evaluation Approach based on Negative Effects for Data Publishing (基於負面效應之資料發佈風險之評估方法)
Advisor: 查士朝 (Shi-Cho Cha)
Oral examination committee: 羅乃維 (Nai-Wei Lo), 陳曉慧 (Hsia-Hui Chen)
Degree: Master
Department: College of Management - Department of Information Management
Year of publication: 2015
Academic year of graduation: 103 (ROC calendar)
Language: Chinese
Number of pages: 51
Keywords (Chinese): 開放資料, 開放政府, 個人資料保護, 隱私, 匿名化, 去識別化
Keywords (English): Open data, Open government, Personal information protection, Privacy risk, Anonymization, De-identification
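  • Many countries around the world are now actively promoting open data. On the one hand, open data makes government data more transparent; on the other, it allows the data to be used more effectively. Businesses have also invested in value-added uses of government open data, creating new business models, and civil society uses the data to monitor government policy. However, if open data contains personal data, releasing it can raise disputes over violations of the Personal Data Protection Act and invasion of privacy. Before data is released, the government should therefore consider whether the release complies with the law and what impact it may have on individuals in terms of privacy and negative effects. To reduce the risk that released data invades personal privacy, many anonymization and de-identification techniques are now available. They process the personal data in a release by generalization, noise addition, and similar methods, lowering the identifiability of the data and thereby the risk of releasing it. To evaluate that risk, countries such as the United Kingdom and the United States carry out a privacy impact analysis or risk assessment before release. Yet even within a single country, practices differ between agencies and there is no common risk assessment method to follow; the same is true in Taiwan. This study therefore proposes that, before the government releases data, a risk assessment be performed on the data set to be released. The assessment computes a risk value for specific combinations of fields, with the negative effect of releasing the data and the probability of identification as the factors that determine the risk value. On this basis the study proposes a risk assessment model that expresses the risk as a quantitative value, so that an expert panel can judge from that value whether the risk is acceptable. If it is acceptable, the data can be released directly. If it is not, the data should be anonymized to reduce the risk and the data set reassessed after processing, repeating until the risk is acceptable.
    The risk assessment model proposed in this study gives government agencies a concrete method they can adopt to evaluate the risk of releasing data.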


    Many countries around the world have recently begun to open the data their governments collect (or simply, open data). In general, opening data can improve governmental transparency and citizens' participation in governmental operations. In addition, interested parties can use the opened data to develop applications with interfaces tailored to their requirements. However, if a government agency releases data that contains personal data, personal privacy may be invaded. Many countries have therefore enacted laws requiring government agencies to anonymize data according to the privacy risks of data misuse. However, these countries usually do not specify how to evaluate the privacy risks of opening a given data set.
    In light of this, this thesis proposes a novel approach that evaluates the privacy risk of a data set based on (1) the negative impact if the data set is misused and (2) the probability that people can link a data record to a specific person. Government agencies can then use the evaluated privacy risk to decide, before publishing a data set, whether to anonymize it further. In sum, the thesis hopes to help relieve the tension between the privacy and utility requirements of opening data.
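    The abstract does not state the model's formula, so the following is only a minimal sketch under loudly stated assumptions: the risk value is taken to be a negative-effect score multiplied by an identifiability probability, and that probability is approximated as 1 divided by the size of the smallest group of records sharing the same values in the chosen fields. The function names, the sample records, the 1-5 effect scale, and the threshold are all hypothetical and are not taken from the thesis.

```python
from collections import Counter


def identifiability(records, columns):
    """Assumed proxy: 1 / size of the smallest group of records that
    share the same values in `columns` (worst-case linkage chance)."""
    groups = Counter(tuple(r[c] for c in columns) for r in records)
    return 1.0 / min(groups.values())


def risk_value(records, columns, negative_effect):
    """Assumed model: risk = negative effect score * identifiability."""
    return negative_effect * identifiability(records, columns)


def generalize_age(records, width):
    """Generalization: replace each age with the lower bound of its band."""
    return [{**r, "age": (r["age"] // width) * width} for r in records]


def assess(records, columns, negative_effect, threshold, widths=(1, 5, 10, 20)):
    """Try coarser age bands until the risk is acceptable.

    Returns (band width, risk) for the first acceptable release,
    or (None, last risk) if no candidate width is acceptable.
    """
    risk = None
    for width in widths:
        candidate = generalize_age(records, width)
        risk = risk_value(candidate, columns, negative_effect)
        if risk <= threshold:
            return width, risk
    return None, risk


# Hypothetical data set and parameters, for illustration only.
original = [
    {"age": 23, "district": "A"},
    {"age": 27, "district": "A"},
    {"age": 31, "district": "B"},
    {"age": 38, "district": "B"},
]
columns = ["age", "district"]

result = assess(original, columns, negative_effect=4, threshold=2.0)
print(result)
```

    With these made-up records, every row is unique at exact ages and at 5-year bands, so the assumed risk stays at 4; at 10-year bands each group holds two records and the risk drops to the threshold. In the process the abstract describes, the accept-or-reprocess decision belongs to an expert panel, so the fixed threshold here only stands in for that judgment.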

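    Abstract (Chinese)
    Abstract
    Acknowledgements
    List of Figures
    List of Tables
    Chapter 1 Introduction
    1.1 Research Background and Motivation
    1.3 Thesis Organization
    Chapter 2 Background Knowledge and Literature Review
    2.1 The Trend Toward Open Data
    2.2 Anonymization Techniques
    2.3 National Practices and Recommendations on Data Anonymization or De-identification
    2.4 Risk Assessment Methods for Open Data
    Chapter 3 Problem Definition
    Chapter 4 Risk Assessment Model and Method for Data Publishing
    4.1 Negative Effect Analysis
    4.2 Identifiability Probability of a Data Set
    4.3 Risk Assessment Model
    Chapter 5 Risk Analysis Case Study
    5.1 Data Set of Fraud Cases Reported to the Police
    5.2 Negative Effect Analysis
    5.3 Identifiability Probability of the Data Set
    5.4 Risk Assessment Results
    5.5 Discussion
    Chapter 6 Conclusions and Future Directions
    6.1 Conclusions
    6.2 Future Research Directions
    References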

