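Author: 林璞雍 (Pu-Yong Lin)
Thesis title: 基於微博平台用戶興趣的個性化推薦方法研究
Research on Personalized Recommendation Method Based on Micro-Blog Users' Interest
Advisors: 李育杰 (Yuh-Jye Lee)
鮑興國 (Hsing-Kuo Pao)
Oral defense committee: 項天瑞 (Tien-Ruey Hsiang)
蘇黎 (Li Su)
Degree: Master
Department: College of Electrical Engineering and Computer Science - Department of Computer Science and Information Engineering
Publication year: 2016
Academic year of graduation: 104 (ROC calendar)
Language: English
Number of pages: 37
Chinese keywords: 微博 (micro-blog), 主題模型 (topic model), 隱形狄里克萊分佈 (latent Dirichlet allocation), 相似度 (similarity)
English keywords: micro-blog, topic model, LDA, similarity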
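  • With the rapid rise of Web 3.0, social networks have grown at a remarkable pace. The fast development of social networks such as RenRen, micro-blog, and Qzone has attracted a very large number of users. In recent years, among domestic social networks, micro-blog's rapid growth and its distinctive open, grassroots, and real-time model have drawn in a great many users. More and more people use micro-blog to obtain real-time information about what is happening around them, in society, and even around the world. However, as the number of micro-blog users keeps growing, hundreds of millions of posts are produced on micro-blog every day, and users face serious "information overload". How to free micro-blog users from this overloaded environment, so that they can find the users they are interested in on the micro-blog network and, through those users, obtain the information they care about, is a problem worth studying. Solving it matters both for improving the experience of micro-blog users and for helping them obtain the information they need quickly.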

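    In this thesis, we study the micro-blog social network platform. We mainly analyze and study personalized recommendation based on the users of the micro-blog platform, and we design and refine a personalized user recommendation algorithm. We use the LDA topic model to construct user topic models and to predict users' interest distributions. The results show that adding the LDA topic model substantially helps improve the accuracy of user recommendation.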


    With the recent rise of Web 3.0, social networks have been developing rapidly. The fast growth of social networks such as RenRen, QZone, micro-blog, and Facebook has attracted a large number of users. In recent years, among domestic social networks, micro-blog has been especially attractive. More and more people learn about events nearby, in society, and around the world through real-time information on the micro-blog platform. But as the number of micro-blog users increases, billions of micro-blog posts are produced every day, and micro-blog users face a serious "information overload" problem. How to help a micro-blog user escape this overload and find interesting information through the users he or she is interested in is a problem worth studying. Solving it is important for improving the user experience and for giving users quick access to information.

    In this thesis, we study an important aspect of the micro-blog social network: we mainly analyze the problem of personalized recommendation for micro-blog users, and we research and design a personalized user recommendation algorithm. We use LDA to construct topic models of users and to predict the interest distribution of target users. The results show that using the LDA topic model improves the accuracy of user recommendation.
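    As a rough illustration of how LDA-derived interest distributions could feed a user recommendation step, the sketch below ranks candidate users by the cosine similarity of their topic distributions to a target user's. This record does not state which similarity measure the thesis uses, so cosine similarity, the function names, and the three-topic example distributions are all assumptions made for illustration only.

```python
import math


def cosine_similarity(p, q):
    """Cosine similarity between two equal-length topic distributions."""
    if len(p) != len(q):
        raise ValueError("topic distributions must have the same length")
    dot = sum(a * b for a, b in zip(p, q))
    norm_p = math.sqrt(sum(a * a for a in p))
    norm_q = math.sqrt(sum(b * b for b in q))
    if norm_p == 0.0 or norm_q == 0.0:
        return 0.0  # an all-zero vector carries no interest signal
    return dot / (norm_p * norm_q)


def rank_candidates(target, candidates):
    """Return candidate user ids, most similar to the target first."""
    scored = [(cosine_similarity(target, dist), uid)
              for uid, dist in candidates.items()]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [uid for _, uid in scored]


# Hypothetical per-user topic distributions over K = 3 topics, standing in
# for what an LDA model would infer from each user's posts (assumption).
target_user = [0.7, 0.2, 0.1]
candidates = {
    "user_a": [0.6, 0.3, 0.1],
    "user_b": [0.1, 0.1, 0.8],
    "user_c": [0.3, 0.4, 0.3],
}

print(rank_candidates(target_user, candidates))
```

    According to its table of contents, the thesis combines several similarity components (user relation, tweet, and interaction similarity); this sketch covers only a content-based component and is not the author's algorithm.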

    1 Introduction
        1.1 Background
        1.2 Motivation
        1.3 Our Main Work
        1.4 Organization of thesis
    2 Related Work
        2.1 Micro-blog Correlation Work
        2.2 User interest mining
    3 Information Source User and Derivation of LDA
        3.1 Information Source User
        3.2 Derivation of LDA
    4 Calculation of Micro-Blog User Similarity
        4.1 Construction of Micro-Blog Users' Information
        4.2 Calculation of User Similarity
            4.2.1 Calculation of User Relation Similarity
            4.2.2 Calculation of Tweet Similarity
            4.2.3 Calculation of Interaction Similarity
    5 Experiment and Result
        5.1 Dataset
        5.2 Setting
            5.2.1 Analytic hierarchy process
            5.2.2 The Number of Topic K
        5.3 Tools
        5.4 Evaluation
    6 Conclusion and Future Work
        6.1 Conclusion
        6.2 Future Work

