Graduate student: | 鄭志欣 Jhih-sin Jheng |
---|---|
Thesis title: | 結合最鄰近演算法與K平均演算法於網頁推薦系統之研究 Applying KNN-means in Web Recommendation System |
Advisor: | 鮑興國 Hsing-Kuo Kenneth Pao |
Oral defense committee: | 李育杰 Yuh-Jye Lee, 項天瑞 Tien-Ruey Hsiang, 鄧惟中 Wei-Chung Teng |
Degree: | Master |
Department: | College of Electrical Engineering and Computer Science - Department of Computer Science and Information Engineering |
Year of publication: | 2012 |
Academic year of graduation: | 100 |
Language: | Chinese |
Number of pages: | 53 |
Chinese keywords: | 最近鄰居法, K平均演算法, 推薦系統 |
English keywords: | K-nearest neighbor, K-means clustering, Recommendation system |
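In recent years, many researchers have studied web page recommendation systems, which use the information contained in web pages to help users filter large volumes of web data so that they can quickly find the pages they are looking for. This thesis studies a recommendation system that combines the nearest neighbor algorithm with the K-means algorithm. Its goals are to recommend web pages that interest the user and to make browsing more convenient. Our recommendation system analyzes the website to which the currently viewed page belongs in order to recommend additional pages to the user. The system is integrated with the browser, so users can click the recommended pages directly on the page they are viewing. This study collects web page information based on the page the user is currently browsing. Using the collected pages, a Chinese word segmentation system analyzes the title and content of each page, and TF-IDF is used to compute keyword information. The KNN-means algorithm then clusters the pages, and the result is presented to the user as a tag cloud so that users can quickly find pages of interest.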
In recent years, many researchers have focused on web recommendation systems, which filter large numbers of web pages so that users can quickly find the information they want.
We propose a recommendation system that combines the K-nearest neighbor algorithm with the K-means clustering algorithm. While the user browses a web page, our system analyzes the current website and recommends additional pages that may interest the user. The recommendation system has four processing phases. First, we collect the pages of the website that the user is currently browsing. Second, we use the CKIP (Chinese Knowledge and Information Processing Group) system to analyze the title and content of every web page, and then compute keyword information with TF-IDF. Third, we apply the proposed KNN-means algorithm to cluster the web pages. Fourth, we present the clustering result as a tag cloud.
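The TF-IDF keyword step in the second phase can be illustrated with a minimal sketch. It assumes the pages have already been segmented into tokens (the thesis uses CKIP for this; the English tokens below are made up for illustration), and it uses the common tf × log(N/df) weighting, which may differ from the exact variant used in the thesis. The function names `tf_idf` and `top_keywords` are hypothetical, and the KNN-means clustering step is not shown because the abstract does not define it.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: weight} dict per document.

    docs is a list of token lists (pages that are already segmented).
    Weight = (term count / document length) * log(N / document frequency).
    """
    n_docs = len(docs)
    # Document frequency: in how many documents each term appears.
    df = Counter()
    for tokens in docs:
        df.update(set(tokens))

    weights = []
    for tokens in docs:
        counts = Counter(tokens)
        total = len(tokens)
        weights.append({
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights


def top_keywords(weight, k=3):
    """Return the k highest-weighted terms, ties broken alphabetically."""
    return sorted(weight, key=lambda term: (-weight[term], term))[:k]


# Hypothetical, pre-segmented pages from one website.
pages = [
    ["taipei", "food", "night", "market", "food"],
    ["taipei", "hotel", "booking"],
    ["taipei", "food", "restaurant", "review"],
]
weights = tf_idf(pages)
for page_weights in weights:
    print(top_keywords(page_weights, k=2))
```

A term that appears on every page, such as `taipei` here, receives a weight of zero, so it would not surface as a keyword. Per-page weight vectors like these are the kind of input a clustering step could consume.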