
Graduate student: Ivanchuk Iaroslav
Thesis title: 影音串流平台影片名稱聚類分析:以Netflix為例 (Clustering for Video Streaming Platform Titles: Using Netflix Data as an Example)
Advisor: 林希偉 (Shi-Woei Lin)
Oral defense committee: 王孔政 (Kung-Jeng Wang), 曾世賢 (Shih-Hsien Tseng)
Degree: Master
Department: School of Management International (MBA)
Publication year: 2021
Academic year of graduation: 109 (ROC calendar)
Language: English
Number of pages: 89
Keywords: Clustering, Recommendation system, Video streaming platform, Products-based recommendations, Netflix, Features engineering
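  • The rapidly developing video streaming industry has become a major segment of the entertainment media market. Leading streaming platforms apply a variety of services built on sophisticated machine learning techniques to increase their popularity and customer satisfaction. Among these services, movie recommendation systems provide substantial commercial value, and as algorithms advance quickly, recommendation systems tend to combine machine learning methods of different types and functions. Nevertheless, cluster analysis remains one of the key techniques required by recommendation systems for streaming services. This study uses cluster analysis to examine the movies and series listed on Netflix, a leader and pioneer among video streaming platforms. Building on the existing case of the Netflix recommendation engine, it applies title clustering techniques to build models on data extracted from the platform. The study quantifies movie characteristics and applies the K-means clustering algorithm and the K-prototypes clustering algorithm, the latter to handle the clustering of mixed-type data. In addition to presenting the results of the clustering models, the thesis discusses their practical application from the perspective of a video streaming platform's recommendation system.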


    The rapidly developing online video streaming industry is becoming one of the major sectors of the entertainment media market. Leading online streaming platforms tend to increase their popularity and customer satisfaction by applying technology-intensive services based on sophisticated machine learning techniques. One of the prime products of those applications, and one that provides tremendous commercial value, is the motion picture recommendation system (RS). As it develops rapidly, the RS tends to incorporate more machine learning methods from different fields. However, clustering remains one of the major techniques required for a video streaming RS. This study discusses clustering applications for the movies and series listed on Netflix, one of the leaders and pioneers of video streaming. Based on the existing case of the Netflix recommendation engine and using data extracted from the platform, this research discusses the problem of clustering titles. The study covers the process of formalizing and quantifying the characteristics of motion pictures, and proposes applying the K-means clustering algorithm to numeric data points and, separately, K-prototypes clustering (Huang, 1998) to mixed-type data. The results comprise several cluster allocations, and the thesis discusses their practical applications from the perspective of a video streaming platform RS.
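    As an illustration of the mixed-type clustering mentioned in the abstract, the following minimal sketch shows the dissimilarity measure that K-prototypes (Huang, 1998) uses: squared Euclidean distance on numeric features plus a weight gamma times the number of mismatched categorical features. It is not the thesis's implementation. The function names, the feature layout, the sample values, and gamma=0.5 are all hypothetical and chosen only for demonstration.

```python
def mixed_dissimilarity(num_a, cat_a, num_b, cat_b, gamma=1.0):
    """Squared Euclidean distance on numeric features plus gamma times
    the count of categorical features whose values differ."""
    numeric_part = sum((x - y) ** 2 for x, y in zip(num_a, num_b))
    categorical_part = sum(1 for x, y in zip(cat_a, cat_b) if x != y)
    return numeric_part + gamma * categorical_part


def nearest_prototype(num, cat, prototypes, gamma=1.0):
    """Return the index of the prototype with the smallest mixed dissimilarity."""
    costs = [
        mixed_dissimilarity(num, cat, p_num, p_cat, gamma)
        for p_num, p_cat in prototypes
    ]
    return min(range(len(costs)), key=costs.__getitem__)


# Hypothetical prototypes: (scaled numeric features), (categorical features)
prototypes = [
    ((0.2, 0.8), ("Movie", "Drama")),
    ((0.9, 0.4), ("TV Show", "Comedy")),
]

# Hypothetical title to assign to a cluster
title_num, title_cat = (0.3, 0.7), ("Movie", "Comedy")
assigned = nearest_prototype(title_num, title_cat, prototypes, gamma=0.5)
print(assigned)
```

    A larger gamma makes categorical mismatches count for more relative to numeric distance, so its value is a modelling choice that should be tuned to the scale of the numeric features.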

    Abstract (Chinese)
    Abstract
    Table of Contents
    List of Tables
    List of Figures
    1. Introduction
    1.1 Research Background
    1.2 Research Objectives
    1.3 Scope and Assumptions
    2. Literature Review
    3. Methodology
    3.1 Framework of the Proposed Analysis
    3.3 Data Preparation & Exploration
    3.4 Clustering
    3.5 Research Tools and Software
    4. Results and Discussion
    4.1 Data Exploration
    4.2 Features Engineering
    4.3 Aggregated Statistics and Datasets Merge
    4.4 K-means Clustering
    4.4 K-prototypes Clustering
    5. Conclusion
    5.1 Clustering for RS: Key Takeaways
    5.2 Managerial Applications and Commercial Value
    5.3 Further Research
    References
    Appendix

    Appriliant, A., 2021, The k-prototype as Clustering Algorithm for Mixed Data Type (Categorical and Numerical), https://towardsdatascience.com/the-k-prototype-as-clustering-algorithm-for-mixed-data-type-categorical-and-numerical-fe7c50538ebb
    Baghbani, M., & Monsefi, R., 2007, A New Hybrid Recommender System Using Dynamic Fuzzy Clustering
    Bansal, S., 2021, Netflix Movies and TV Shows, https://www.kaggle.com/shivamb/netflix-shows (Online Accessed: April 30, 2021)
    Barrabi, T., 2020, Who founded Netflix? https://www.foxbusiness.com/markets/how-was-netflix-founded-history
    Berg, D., Chirravuri, R., Cledat, R., Goyal, S., Hamad, F., & Tuulos, V., 2019, Open-Sourcing Metaflow, a Human-Centric Framework for Data Science
    Bhatia, R., 2020, Movies on Netflix, Prime Video, Hulu and Disney, https://www.kaggle.com/ruchi798/movies-on-netflix-prime-video-hulu-and-disney (Online Accessed: April 30, 2021)
    Chong, D., 2020, Deep Dive into Netflix’s Recommender System, https://towardsdatascience.com/deep-dive-into-netflixs-recommender-system-341806ae3b48
    Comparably, Netflix Data Scientist Salary, https://www.comparably.com/companies/netflix/salaries/data-scientist (Online Accessed: May 10, 2021)
    Conviva, 2021, Conviva’s State of Streaming Q4 2020, https://www.conviva.com/research/convivas-state-of-streaming-q4-2020/
    Curry, D., 2021, Video Streaming App Revenue and Usage Statistics (2021), https://www.businessofapps.com/data/video-streaming-app-market/
    Dakhel, G., & Mahdavi, M., 2011, A new collaborative filtering algorithm using K-means clustering and neighbors' voting
    Gong, S., 2010, A Collaborative Filtering Recommendation Algorithm Based on User Clustering and Item Clustering, Journal of Software, Vol. 5 pp. 745-752
    Gruenwedel, E., 2021, Report: TV Viewing Declining in 2021, https://www.mediaplaynews.com/report-tv-viewing-to-decline-in-2021/
    He, Z., 2021, Approximation Algorithms for K-Modes Clustering
    Huang, Zh., 1998, Extensions to the k-Means Algorithm for Clustering Large Data Sets with Categorical Values
    IMDb, Press Pull Quotes About IMDb’s Traffic, Reach and Industry Importance, https://www.imdb.com/pressroom/press-pull-quotes/ (Online Accessed: June 25, 2021)
    IMDb, IMDb Statistics As of Jun 2021, https://www.imdb.com/pressroom/stats/ (Online Accessed: June 25, 2021)
    K-prototypes, k-prototypes documentation, https://kprototypes.readthedocs.io/en/latest/ (Last Accessed: May 11)
    Kużelewska, U., 2014, Clustering Algorithms in Hybrid Recommender System on MovieLens Data
    Matplotlib, Introduction, https://matplotlib.org/2.0.2/index.html# (Online Accessed: June 7, 2021)
    Netflix, Analytics: Driving insights from data, https://research.netflix.com/research-area/analytics (Online Accessed: April 10, 2021)
    Netflix, Complete List of Netflix Originals, https://www.whats-on-netflix.com/originals/ (Online Accessed: April 20, 2021)
    Netflix, The Netflix Prize Rules, https://www.netflixprize.com/rules.html (Online Accessed: May 25, 2021)
    Nguyen, N., 2018, Netflix Wants To Change The Way You Chill, https://www.buzzfeednews.com/article/nicolenguyen/netflix-recommendation-algorithm-explained-binge-watching
    Numpy, ABOUT US, https://numpy.org/about/ (Online Accessed: June 7, 2021)
    O'Rourke, P., 2016, How Netflix’s recommendation system knows exactly what you want to watch, https://mobilesyrup.com/2016/04/06/how-netflixs-recommendation-system-knows-exactly-what-you-want-to-watch/
    Pandas, About pandas, https://pandas.pydata.org/about/index.html (Online Accessed: June 7, 2021)
    Pitsilis, G., Zhang, X., & Wang, W., 2021, Clustering Recommenders in Collaborative Filtering Using Explicit Trust Information
    Roettgers, J., 2014, Netflix spends $150 million on content recommendations every year, https://gigaom.com/2014/10/09/netflix-spends-150-million-on-content-recommendations-every-year/
    Satopaa, V., Albrecht, J., Irwin, D., Raghavan, B., & College, W., 2021, Finding a “Needle” in a Haystack: Detecting Knee Points in System Behavior
    Scikit-Learn, Compute the mean Silhouette Coefficient of all samples, https://scikit-learn.org/stable/modules/generated/sklearn.metrics.silhouette_score.html (Online Accessed: April 5, 2021)
    Scikit-Learn, K-Means clustering, https://scikit-learn.org/stable/modules/generated/sklearn.cluster.KMeans.html (Online Accessed: April 5, 2021)
    Shaw, L., 2018, Advertisers Tuning Out TV in Sign of Trouble for Media Companies, https://www.bloomberg.com/news/articles/2018-02-14/advertisers-tuning-out-tv-in-sign-of-trouble-for-media-companies
    Smith, J., Indig, J., & Siddiqi, F., 2019, Open-sourcing Polynote: an IDE-inspired polyglot notebook
    Sneha, Y., & Mahadevan, G., 2011, A Study on Clustering Techniques in Recommender Systems
    Souabi, S., Retbi, A., Khalidi, M., & Bennani, S., 2021, A Recommendation Approach in Social Learning Based on K-Means Clustering, Advances in Science, Technology and Engineering Systems Journal, Vol. 6, pp. 719-725
    Technavio, 2021, $ 149.96 Billion growth expected in Online Streaming Services Market, https://www.prnewswire.com/news-releases/-149-96-billion-growth-expected-in-online-streaming-services-market--14-38-yoy-growth-in-2020-amid-covid-19-spread--north-america-to-notice-maximum-growth--technavio-301283480.html
    The Python Package Index, kmodes Description, https://pypi.org/project/kmodes/ (Online Accessed: April 3, 2021)
    The Python Package Index, Knee-point detection in Python, https://pypi.org/project/kneed/ (Online Accessed: May 5, 2021)
    Wallach, O., 2021, Which Streaming Service Has the Most Subscriptions?, https://www.visualcapitalist.com/which-streaming-service-has-the-most-subscriptions/
    Wasid, M., & Rashid, A., 2018, An Improved Recommender System based on Multi-criteria Clustering Approach, Procedia Computer Science, Vol. 131, pp. 93-101
