
Graduate student: 王宏仁 (Hung-Jen Wang)
Thesis title: A Practical Application on Predicting the Box Office Performance in the Taiwanese Film Market Based on Machine Learning and Data Mining Techniques (機器學習與資料探勘的實作應用-預測電影票房在台灣市場的表現)
Advisor: 呂志豪 (Shih-Hao Lu)
Oral defense committee: 曾盛恕 (Su-Seng Tsang), 張飛黃, 鄭仁偉 (Jen-Wei Cheng)
Degree: Master
Department: School of Management - School of Management International (MBA)
Year of publication: 2019
Academic year of graduation: 107 (ROC calendar)
Language: English
Number of pages: 48
Keywords: Taiwanese Film Market, Movies, Box Office Forecasting, Machine Learning, Data Mining, Word-of-Mouth (WOM), Random Forest Algorithm, Support Vector Machine Algorithm, Logistic Regression Algorithm, Classification
  • Over the past decade, research has given the film industry useful information on pre-production and in-release forecasting of movie revenues using a range of machine learning algorithms. However, that research has focused on Hollywood movies and on domestic revenues in their own countries, while total box office revenue in the Taiwanese film market remains largely uninvestigated. In addition, little progress has been made in developing machine learning based forecasting models for the Taiwanese film industry.
    This study took total box office revenue, the dependent variable, from a report published by the National Taiwan Film Institute. Previous studies on this topic have closely examined internal variables, such as features extracted from the movie itself, and external variables, including consumer word-of-mouth (WOM).
    In this study, a Random Forest algorithm was used to predict, during a movie's theatrical period, which of six categories its approximate final box office revenue would fall into. Two other machine learning algorithms, the Support Vector Machine and Logistic Regression, were applied for comparison with the Random Forest. The feature importance analysis also offers decision makers in the movie industry a different perspective and a reference for making appropriate decisions.
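
    The comparison described in the abstract could be sketched roughly as follows. This is a minimal illustration using scikit-learn, not the thesis's actual pipeline: the data are synthetic, the feature names are hypothetical placeholders (this record does not list the real variables), and the hyperparameters are arbitrary defaults. It trains a Random Forest, a Support Vector Machine, and a Logistic Regression classifier on a six-class target, reports test accuracy for each, and ranks features by Random Forest importance.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Hypothetical feature names, for illustration only; the thesis's real
# explanatory variables are not given in this record.
FEATURES = [
    "budget", "screens", "star_power", "sequel",
    "wom_volume", "wom_rating", "genre_code", "release_month",
]

# Synthetic stand-in data: 600 "movies", each labeled with one of six
# revenue categories (classes 0..5). Real data would replace this step.
X, y = make_classification(
    n_samples=600,
    n_features=len(FEATURES),
    n_informative=5,
    n_redundant=1,
    n_classes=6,
    n_clusters_per_class=1,
    random_state=0,
)

# Hold out a stratified test set so every category appears in both splits.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# SVM and Logistic Regression are scale-sensitive, so they are wrapped in a
# pipeline with standardization; Random Forest does not need scaling.
models = {
    "random_forest": RandomForestClassifier(n_estimators=200, random_state=0),
    "svm": make_pipeline(StandardScaler(), SVC()),
    "logistic_regression": make_pipeline(
        StandardScaler(), LogisticRegression(max_iter=1000)
    ),
}

# Fit each model and record its classification accuracy on the test set.
accuracy = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    accuracy[name] = model.score(X_test, y_test)
    print(f"{name}: accuracy = {accuracy[name]:.3f}")

# Impurity-based feature importances from the fitted Random Forest,
# sorted from most to least important.
importances = models["random_forest"].feature_importances_
ranking = sorted(zip(FEATURES, importances), key=lambda p: p[1], reverse=True)
for feature, importance in ranking:
    print(f"{feature}: {importance:.3f}")
```

    On real data, the six categories would come from binning each movie's total box office revenue, and the importance ranking would be read as a rough guide to which variables the forest relied on rather than as a causal statement.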



    ABSTRACT
    ACKNOWLEDGEMENTS
    TABLE OF CONTENTS
    FIGURE INDEX
    TABLE INDEX
    CHAPTER 1 INTRODUCTION
        1.1 BACKGROUND
        1.2 STATEMENT OF PROBLEMS
        1.3 PURPOSE OF THIS STUDY
        1.4 THE FLOW CHART OF THIS STUDY
    CHAPTER 2 LITERATURE REVIEW
        2.1 BOX-OFFICE FORECASTING STUDIES
        2.2 EXPLANATORY VARIABLES
        2.3 PERFORMANCE METRICS
    CHAPTER 3 METHODOLOGY
        3.1 RAW DATA COLLECTING
        3.2 DATA PREPROCESSING
        3.3 MODEL TRAINING & TESTING
        3.4 MACHINE LEARNING ALGORITHMS APPLICATION
    CHAPTER 4 RESULTS
        4.1 RANDOM FOREST (RF) ALGORITHM PERFORMANCE
        4.2 COMPARISON TO OTHER MODELS
        4.3 FEATURE IMPORTANCE ANALYSIS
    CHAPTER 5 CONCLUSION AND DISCUSSION
        5.1 CONCLUSION AND DISCUSSION
        5.2 LIMITATIONS
        5.3 SUGGESTION FOR FUTURE STUDY
        5.4 IMPLICATION
    REFERENCES

