Graduate student: | 姚昭宇 Eric Dao |
---|---|
Thesis title: | 應用網路域名位置特徵於監督式機器學習的詐騙域名偵測 Application of Network Domain Location Features in Supervised Machine Learning for Ecommerce Scam Domain Detection |
Advisor: | 鄧惟中 Wei-Chung Teng |
Oral defense committee: | 黃勝雄 Kenny Huang, 鮑興國 Hsing-Kuo Pao, 陳俊良 Jiann-Liang Chen |
Degree: | Master |
Department: | College of Electrical Engineering and Computer Science - Department of Computer Science and Information Engineering |
Year of publication: | 2021 |
Academic year of graduation: | 109 |
Language: | English |
Number of pages: | 68 |
Keywords (Chinese): | domain, location, feature, ecommerce, scam, network |
Keywords (English): | domain, location, feature, ecommerce, scam, network |
Ecommerce scams are a form of cybercrime that affects online shoppers in nearly every country. Criminal groups set up deceptive ecommerce websites that lure consumers into purchasing products, then take the consumers' money without delivering what was promised. Researchers have used a variety of domain features, ranging from a website's HTML source code to a domain's DNS data, to build frameworks that identify ecommerce scam websites. However, much of the previous literature on this subject has neglected the potential of a domain's location data for differentiating ecommerce scam websites from benign ecommerce websites. To find novel ways to combat ecommerce scams, this thesis investigates the use of a domain's location data as novel features for detecting ecommerce scam websites.
First, testing with supervised machine learning models showed that our novel domain location features, in the form of domain location co-occurrences and geographical distances, are effective for detecting ecommerce scam domains. Second, to our knowledge, we are the first researchers to perform a detailed analysis of domain location features across benign and scam ecommerce domains. This analysis revealed that, compared with benign ecommerce domains, ecommerce scam domains tended to have much lower location co-occurrences with, and larger location distances from, the country they were marketing towards. Third, we analyzed the location features in our dataset at the level of a single country and, to our knowledge, are the first researchers to reveal the current trends in domain location data for ecommerce scam and benign websites in Taiwan. Ecommerce scam domains in Taiwan had more location associations with China and few or none with Taiwan. Conversely, benign ecommerce domains in Taiwan tended to have more location associations with Taiwan and few or none with China. This could serve as strong evidence that, for foreign scam groups targeting a specific country, it is difficult, risky, or costly to ensure that their scam domain's various location data are located in the target country. Hence, the novel domain location features introduced in this thesis appear to be viable for detecting ecommerce scam domains, since scam groups are unlikely to be able to change them at a whim to evade detection.
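The two feature families named above, location co-occurrence and geographical distance, can be illustrated with a minimal sketch. It is not the thesis's implementation, and everything in it is an assumption made for illustration: the location source names (`cctld`, `server_ip`, `whois_registrant`), the helper `location_features`, the rounded country centroids, and the use of the haversine great-circle formula in place of whatever distance measure the thesis uses.

```python
from math import asin, cos, radians, sin, sqrt

# Approximate country centroids as (latitude, longitude).
# Rounded, illustrative values; not taken from the thesis dataset.
CENTROIDS = {
    "TW": (23.70, 120.96),
    "CN": (35.86, 104.20),
    "US": (37.09, -95.71),
}

EARTH_RADIUS_KM = 6371.0


def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lon) points."""
    lat1, lon1 = map(radians, a)
    lat2, lon2 = map(radians, b)
    h = (sin((lat2 - lat1) / 2) ** 2
         + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * asin(sqrt(h))


def location_features(locations, target):
    """Compute hypothetical location features for one domain.

    locations: dict mapping a location source name to a country code,
               or None when that source yields no country.
    target:    country code of the market the site is aimed at.
    """
    # Ignore sources with no known country.
    known = [c for c in locations.values() if c in CENTROIDS]
    # Co-occurrence: how many location sources agree with the target country.
    co_occurrence = sum(1 for c in known if c == target)
    # Distance: mean centroid-to-centroid distance from the target country.
    distances = [haversine_km(CENTROIDS[c], CENTROIDS[target]) for c in known]
    mean_distance = sum(distances) / len(distances) if distances else None
    return {"co_occurrence": co_occurrence, "mean_distance_km": mean_distance}


# Two made-up domains that both market to Taiwan.
benign = location_features(
    {"cctld": "TW", "server_ip": "TW", "whois_registrant": "TW"}, "TW")
scam = location_features(
    {"cctld": None, "server_ip": "CN", "whois_registrant": "CN"}, "TW")
print(benign)
print(scam)
```

In this toy input the benign domain has every location source in the target country, while the scam domain has none and sits roughly two thousand kilometres away on average. Feature vectors of this kind could then be passed to any supervised classifier.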