
Author: 鄭惠如
Hui-ju Chen
Thesis title: 熱門影片預測系統-基於使用者興趣關係的社群網路之研究
Hot Video Prediction System Based on User Interesting Social Network
Advisor: 李漢銘
Hahn-Ming Lee
Committee members: 陳振楠
Jenn-Nan Chen
徐讚昇
Tsan-sheng Hsu
賴溪松
Chi-Sung Laih
李育杰
Yuh-Jye Lee
Degree: Master
Department: College of Electrical Engineering and Computer Science, Department of Computer Science and Information Engineering
Year of publication: 2008
Academic year of graduation: 96 (ROC calendar)
Language: English
Number of pages: 62
Chinese keywords: 線上影片預測, 使用者興趣, 社群網路
English keywords: online video prediction, user interesting, social network

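With the growth of the Internet, online information services such as search engines, blogs, photo sharing, and online video have flourished. Online video is a new service that has become popular only in recent years. Because camcorders are now widespread, many people can easily shoot and produce videos, which has attracted a large volume of user-generated video uploads. How to attract viewers' attention and pair videos with commercial advertising has therefore become a concern of website operators. However, most current research focuses on placing ads that appeal to viewers on suitable videos rather than on predicting which online videos will become popular. The main motivation of this thesis is therefore to predict which new online videos are about to become hot, so as to maximize advertising benefit.
There is already a large body of research on detecting hot online content, for example hot news topics, hot websites, and hot blogs. This kind of hot topic detection relies on large amounts of rich text and clusters content on the same topic by computing the similarity of text vectors. Online videos, however, carry little textual information, and its quality is lower than that of the content types above. In addition, online videos are updated very quickly, with a large number of new videos uploaded every day, so hot videos rise and fade faster than other online content. As a result, hot video topics change rapidly and cover a very wide range. Clustering on the textual information of hot videos to distinguish hot from non-hot videos runs into two problems. First, sparse text does not provide enough discriminative information to judge whether a new video will become hot. Second, video redundancy leads to a high misclassification rate when text is used to identify hot videos. Redundancy means that many near-duplicate copies of a video on the same topic may be uploaded, of which only a few become hot while most do not.
This thesis therefore proposes a highly discriminative source of information for predicting whether a video will become hot. A video becoming hot can be viewed as the outcome of a vote by a group of users who share the same opinion. We aim to reflect the current trend of hot videos by modeling the relations among users' preferences for video content. Comment data is used to construct user interest relations that reflect a video's latent topic: a group of users who have previously commented together on hot videos of the same topic can be used to identify whether a new video belongs to that topic, and hence to judge whether it is also about to become hot. A User Interest Social Network (UISN) is therefore constructed to reflect the current trend of hot videos. Based on the interest similarity among the users who comment on a new video in its early stage, the system judges whether the video's topic is similar to that of past hot videos and predicts whether it is about to become hot.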
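The idea of linking users who commented on the same past hot videos, and then checking how strongly the early commenters of a new video are already linked, can be illustrated with a minimal Python sketch. This is not the thesis's implementation: the function names, the co-comment edge weighting, the pair-density score, and the 0.5 threshold are all assumptions chosen for illustration.

```python
from collections import defaultdict
from itertools import combinations


def build_uisn(hot_video_comments):
    """Build a weighted co-comment graph (hypothetical UISN stand-in).

    hot_video_comments maps a past hot video id to the users who commented
    on it. An edge (u, v) counts how many hot videos u and v both commented on.
    """
    weights = defaultdict(int)
    for users in hot_video_comments.values():
        for u, v in combinations(sorted(set(users)), 2):
            weights[(u, v)] += 1
    return dict(weights)


def interest_score(uisn, early_commenters):
    """Fraction of early-commenter pairs that are already linked in the graph."""
    users = sorted(set(early_commenters))
    pairs = list(combinations(users, 2))
    if not pairs:
        return 0.0
    linked = sum(1 for pair in pairs if pair in uisn)
    return linked / len(pairs)


def predict_hot(uisn, early_commenters, threshold=0.5):
    """Predict 'hot' when the score reaches an assumed threshold."""
    return interest_score(uisn, early_commenters) >= threshold


# Toy data: two past hot videos and their commenters (made up for illustration).
hot_history = {
    "v1": ["ann", "bob", "cat"],
    "v2": ["ann", "bob", "dan"],
}
uisn = build_uisn(hot_history)
print(predict_hot(uisn, ["bob", "ann", "cat"]))  # early commenters all linked
print(predict_hot(uisn, ["cat", "dan", "eve"]))  # early commenters not linked
```

Sorting each user pair keeps the edge keys canonical, so the same pair is found regardless of the order in which comments arrive.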


Content-targeted advertising is a popular advertising strategy whose goal is to associate ads with appropriate web content that can reach a large number of targeted customers. However, searching for hot videos by analyzing video content leads to a higher false positive rate because of three characteristics of online videos: massive volume, fast updates, and redundancy. In addition, searching for hot videos by analyzing insufficient time-series data yields lower accuracy, because online videos burst and become obsolete quickly. To improve prediction accuracy, we use users' social context to reduce the effect of variation in video content and to compensate for insufficient data in the early prediction stage. In this thesis, a User Interest Social Network (UISN) is constructed to represent the trend of hot videos by modeling user interest relations. The main idea of the proposed system is to identify cohesive subgroups of users with similar interests, which can then be used to predict the online videos that most people are likely to find interesting. The UISN is also adapted to changes in user interest over time. By using the UISN to supplement the insufficient information available in the early prediction stage, the proposed system can effectively predict hot videos. The UISN also alleviates the prediction inaccuracy caused by the characteristics of online videos. Furthermore, by adapting to changes in user interest and filtering out noisy users, the false positive rate can be kept under 2%, while prediction accuracy decreases only slightly.
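The abstract does not say how cohesive subgroups are identified. The table of contents mentions a K-core setting, so the sketch below assumes plain k-core peeling on an undirected user graph; the helper names and the toy edge list are hypothetical, and the thesis's actual procedure may differ.

```python
def build_adjacency(edges):
    """Build a symmetric adjacency map from an undirected edge list."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def k_core(adjacency, k):
    """Return the nodes of the k-core: repeatedly drop nodes with degree < k."""
    adj = {u: set(vs) for u, vs in adjacency.items()}  # work on a copy
    changed = True
    while changed:
        changed = False
        for u in list(adj):
            if len(adj[u]) < k:
                # Detach u from its remaining neighbours, then remove it.
                for v in adj[u]:
                    adj[v].discard(u)
                del adj[u]
                changed = True
    return set(adj)


# Toy graph: a 4-user clique (a, b, c, d) with a loosely attached tail (e, f).
edges = [
    ("a", "b"), ("a", "c"), ("a", "d"),
    ("b", "c"), ("b", "d"), ("c", "d"),
    ("a", "e"), ("e", "f"),
]
graph = build_adjacency(edges)
core_users = k_core(graph, 3)
print(sorted(core_users))
```

Raising `k` keeps only more tightly connected users, which is one way to filter out loosely attached (noisy) users at the cost of covering fewer users overall.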

Abstract
Contents
List of Figures and Tables
Chapter 1 Introduction
  1.1 Hot Video Prediction on Online Video Sharing Community
  1.2 The Challenges of Current Research
  1.3 Motivations
  1.4 Goals
  1.5 The Outline of Thesis
Chapter 2 Background and Related Work
  2.1 The Online Video Sharing Community Website
  2.2 The Related Work of Comment Data
  2.3 The Related Work of Hot Topic Detection
  2.4 The Social Network Application
Chapter 3 Hot Video Prediction System
  3.1 Concept of User Interesting Social Network (UISN)
  3.2 The System Architecture of UISN
  3.3 User Interest Social Network (UISN) Constructor
  3.4 Hot Video Prediction Stage
    3.4.1 Spreading activation
    3.4.2 Cohesive subgroup detection
  3.5 Adaptive Learning Stage
    3.5.1 The prediction result checking unit
    3.5.2 User profile updating unit
    3.5.3 UISN pruning unit
Chapter 4 Experiments
  4.1 Description of Data Set
  4.2 Evaluation Design
  4.3 UISN Parameter Setting
    4.3.1 The effect of different user set
    4.3.2 The effect of different K-core setting
    4.3.3 The effect of user extension
    4.3.4 The pruning threshold setting
    4.3.5 The summarization of experiment results
  4.4 The Performance of Hot Video Prediction System
    4.4.1 The overall performance of proposed system
    4.4.2 The prediction delay evaluation
    4.4.3 The analysis of false prediction result
    4.4.4 The summarization of overall prediction performance
Chapter 5 Conclusion and Further Work
  5.1 Conclusion
  5.2 Further Work
References
Vita

