
Graduate student: 張少東
Shao-Tung Chang
Thesis title: 網路伺服器上匿名使用者之分群與推測
Clustering and Inference of Anonymous User Visits to Web Servers
Advisor: 陳秋華
Chyou-Hwa Chen
Oral examination committee: 金台齡
Tai-Ling Chin
鄭欣明
Shin-Ming Cheng
邱舉明
Ge-Ming Chiu
Degree: Master's
Department: College of Electrical Engineering and Computer Science - Department of Computer Science and Information Engineering
Year of publication: 2015
Academic year of graduation: 103 (ROC calendar)
Language: Chinese
Number of pages: 71
Keywords (Chinese): 瀏覽器指紋, 使用者追蹤, 分群, 分類
Keywords (English): browser fingerprint, user tracking, clustering, classification
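  • As awareness of personal privacy has grown, cookie-based tracking has come under scrutiny. User analytics has traditionally stored an identifier in a cookie on the user's device so that, on the next visit, the identifier in the cookie can be used to determine who the visitor is. Because this approach encroaches on privacy, the EU has warned that storing data on users' devices should be restricted, and anonymous user tracking techniques emerged as a result.
    Anonymous user tracking exploits the differences that arise from how people use their browsers and devices. When a user visits a web server, the server can obtain several feature values and use them to decide which access records (fingerprints) belong to the same person. Because anonymous visitors do not provide private data, these features serve as the basis for telling them apart. The feature values a server can obtain typically include the browser (User Agent), time zone, plug-ins, fonts, and so on. Techniques that use these features to distinguish anonymous visitors have been widely studied and discussed in the international literature. Because some of the obtainable feature values change frequently, our main goal is to identify features with high reliability and low variability and so arrive at the best feature selection, to use data preprocessing to raise accuracy, and to build a clustering algorithm that groups the training set well so that new fingerprints can then be classified.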
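    The goal of finding features with high reliability and low variability can be illustrated with a minimal sketch. It is not the thesis's procedure: it assumes a labelled sample in which the true user behind each visit is known, and it scores each feature by how often its value changes between consecutive visits by the same user. The function name `feature_change_rates`, the feature names, and the sample values are all hypothetical.

```python
from collections import defaultdict


def feature_change_rates(visits):
    """Estimate how variable each fingerprint feature is.

    visits: list of (user_id, fingerprint_dict) in chronological order.
    Returns {feature: fraction of consecutive same-user visit pairs
    in which the feature value changed}.
    """
    last_seen = {}               # user_id -> previous fingerprint
    changes = defaultdict(int)   # feature -> number of observed changes
    pairs = defaultdict(int)     # feature -> number of compared pairs
    for user_id, fingerprint in visits:
        previous = last_seen.get(user_id)
        if previous is not None:
            for name, value in fingerprint.items():
                if name in previous:
                    pairs[name] += 1
                    if previous[name] != value:
                        changes[name] += 1
        last_seen[user_id] = fingerprint
    return {name: changes[name] / pairs[name] for name in pairs}


# Hypothetical labelled sample: two users, two visits each.
visits = [
    ("u1", {"timezone": -480, "window_size": "1280x720"}),
    ("u2", {"timezone": 0, "window_size": "1920x1080"}),
    ("u1", {"timezone": -480, "window_size": "1024x700"}),
    ("u2", {"timezone": 0, "window_size": "1920x1080"}),
]
rates = feature_change_rates(visits)
print(rates)
```

    In this toy sample the time zone never changes while the window size changes in half of the compared pairs, so the time zone would be the more stable candidate. A real selection would also need to weigh how well a feature separates different users.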


    Awareness of personal privacy has risen in recent years. User analytics has traditionally tracked users with cookies, storing an identifier on the user's device; when the same user visits the website again, the identifier is used to determine who it is.
    However, in 2009 EU legislators warned that the ability to store information on clients' devices should be restricted, which led to anonymous user tracking.
    Anonymous user tracking mainly uses browser and device features that differ from person to person to identify users. This makes user tracking possible even when storing information on the device is restricted.
    Features such as the User Agent, time zone, plug-ins, and fonts can be fed to machine learning methods to solve this problem. According to prior research on browser fingerprinting, these features can change over time. Our goal is to find features with high reliability and low variability, use preprocessing to increase accuracy, cluster the dataset, and classify new fingerprints.
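    The cluster-then-classify idea can be sketched as follows. This is a minimal illustration, not the thesis's implementation: it uses canopy clustering (the algorithm named in section 5.1 of the table of contents) with an assumed distance equal to the fraction of mismatching features. The thresholds `t1` and `t2`, the feature names, and the sample fingerprints are hypothetical.

```python
def distance(a, b, features):
    """Fraction of features whose values differ (0.0 means identical)."""
    return sum(a.get(f) != b.get(f) for f in features) / len(features)


def canopy_cluster(points, features, t1, t2):
    """Canopy clustering with loose threshold t1 and tight threshold t2 (t1 >= t2).

    Returns a list of (center_index, member_indices) tuples.
    """
    remaining = list(range(len(points)))
    canopies = []
    while remaining:
        center = remaining[0]
        # Every remaining point within the loose threshold joins this canopy.
        members = [i for i in remaining
                   if distance(points[center], points[i], features) <= t1]
        canopies.append((center, members))
        # Points within the tight threshold cannot seed or join later canopies.
        remaining = [i for i in remaining
                     if distance(points[center], points[i], features) > t2]
    return canopies


def classify(fingerprint, points, canopies, features, t1):
    """Index of the canopy with the nearest center, or None if none is within t1."""
    best_index, best_distance = None, None
    for index, (center, _members) in enumerate(canopies):
        d = distance(fingerprint, points[center], features)
        if best_distance is None or d < best_distance:
            best_index, best_distance = index, d
    if best_distance is not None and best_distance <= t1:
        return best_index
    return None


# Hypothetical training fingerprints.
features = ["user_agent", "timezone", "language", "fonts"]
points = [
    {"user_agent": "Firefox/38", "timezone": -480, "language": "zh-TW", "fonts": "A"},
    {"user_agent": "Firefox/39", "timezone": -480, "language": "zh-TW", "fonts": "A"},
    {"user_agent": "Chrome/43", "timezone": 60, "language": "de", "fonts": "B"},
    {"user_agent": "Chrome/43", "timezone": 60, "language": "de", "fonts": "C"},
]
canopies = canopy_cluster(points, features, t1=0.5, t2=0.25)

returning = {"user_agent": "Firefox/40", "timezone": -480, "language": "zh-TW", "fonts": "A"}
stranger = {"user_agent": "Safari/8", "timezone": -300, "language": "en-US", "fonts": "Z"}
print(canopies)
print(classify(returning, points, canopies, features, t1=0.5))
print(classify(stranger, points, canopies, features, t1=0.5))
```

    With these assumed thresholds the four training fingerprints fall into two canopies. A new fingerprint that differs from a canopy center only in browser version is assigned to that canopy, while one that matches nothing is left unassigned and could be treated as a new visitor.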

    Acknowledgements
    Abstract (Chinese)
    Abstract
    Table of Contents
    List of Figures
    List of Tables
    1. Introduction
    2. Related Work
    3. Feature Collection Architecture
    4. Feature Selection
    4.1 User Agent
    4.2 Plug-in
    4.3 Accept Language
    4.4 Browser window size
    4.5 Screen size
    4.6 Flash Fonts
    4.7 CSS Fonts
    4.8 Time zone
    4.9 Mobile Brand
    4.10 IP
    5. Clustering and Classifying Users
    5.1 Canopy Clustering Algorithm
    5.2 Feature Selection for Clustering
    5.3 Classifying the Test Set Using the Clustered Data as the Training Set
    6. Implementing the Flood/Karlsson Framework Model
    6.1 Comparison between Naïve Bayes and KNN implementation
    6.1.1 10-fold validation
    6.2 Classification Time Analysis
    6.2.1 Classify monthly
    6.2.2 Classify weekly
    6.3 Using the clustering as the basis for classification to implement the Erik Flood, Joel Karlsson classifier model
    7. Conclusion
    8. Future Work
    9. References

