
Graduate student: 馮翊婷 (Yi-Ting Feng)
Thesis title: Video Summarization of the Game Videos on the Twitch Platform
Advisor: 楊傳凱 (Chuan-Kai Yang)
Oral defense committee: 林伯慎 (Bo-Shen Lin), 孫沛立 (Pei-Li Sun)
Degree: Master
Department: School of Management - Department of Information Management
Year of publication: 2019
Graduation academic year: 107 (2018-2019)
Language: Chinese
Number of pages: 83
Chinese keywords: 評論分析、聲音分析、表情分析、SVM、影片摘要
English keywords: Comment analysis, Sound analysis, Expression analysis, SVM, Video summary
Access count: 301 views, 0 downloads
Uploading and sharing videos has become a common way for people to share what they see and do. With the rapid growth of the Internet, people can also livestream their own audio and video to online viewers in real time. The sheer volume of video content, however, makes it difficult for viewers to decide what to watch. Video summarization selects the highlights of a video so that users can quickly grasp its content.

Twitch is an audio and video streaming (livestreaming) platform for games. It hosts a wide variety of game categories, so viewers can choose the categories they like, watch streamers broadcast their gameplay, and interact with them through real-time chat comments. A gameplay stream is usually several hours long, most of it is uneventful, and only a small portion is genuinely exciting. We therefore derive a summary of each stream from three cues: the chat comments, the streamer's voice, and the streamer's facial expressions. For the comments, we analyze how the comment frequency changes over time and extract high-frequency keywords from the chat.
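The record does not include the thesis's code, but the comment-frequency idea can be made concrete with a short sketch. The following Python fragment is only an illustration under assumed settings (a 30-second window and a mean-plus-1.5-standard-deviations threshold); the thesis's own window length and thresholds may differ.

    from statistics import mean, stdev

    def find_comment_peaks(comments, window_sec=30, k=1.5):
        # comments: iterable of (timestamp_in_seconds, text) pairs from the chat log.
        # Returns (window_start_second, comment_count) for windows whose comment
        # count is well above the stream-wide average, as candidate highlights.
        counts = {}
        for ts, _text in comments:
            bucket = int(ts // window_sec) * window_sec
            counts[bucket] = counts.get(bucket, 0) + 1
        if not counts:
            return []
        values = list(counts.values())
        spread = stdev(values) if len(values) > 1 else 0.0
        threshold = mean(values) + k * spread
        return sorted((start, n) for start, n in counts.items() if n >= threshold)

Windows flagged this way would then be merged with the audio and expression cues described below.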
For the audio, we first separate the streamer's voice from the background music with a deep-learning source-separation model, so that the game's soundtrack does not interfere with the analysis, and then watch for noticeable rises in the volume of the extracted vocal track.
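As a rough illustration of the volume cue, the sketch below measures per-second RMS loudness of an already separated vocal track and flags unusually loud seconds. The use of librosa and the mean-plus-k-standard-deviations threshold are assumptions of this sketch, not necessarily the tools or settings used in the thesis.

    import librosa  # assumed here only for loading audio and computing RMS

    def loud_vocal_seconds(vocal_wav_path, frame_sec=1.0, k=1.5):
        # vocal_wav_path: path to the vocal stem produced by a source-separation model.
        # Returns (second, rms) pairs whose loudness exceeds mean + k * std,
        # i.e. moments where the streamer's voice is unusually loud.
        y, sr = librosa.load(vocal_wav_path, sr=None, mono=True)
        hop = int(sr * frame_sec)
        rms = librosa.feature.rms(y=y, frame_length=hop, hop_length=hop)[0]
        threshold = rms.mean() + k * rms.std()
        return [(i * frame_sec, float(v)) for i, v in enumerate(rms) if v >= threshold]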
For the facial expressions, we use an SVM to classify the streamer's face in each sampled frame as exaggerated, happy, neutral (no expression), or speaking, and track how the expression changes over time. Finally, we combine all three cues to select the segments that form the video summary.
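A minimal sketch of a four-class SVM expression classifier is given below. It assumes Dlib's 68-point facial landmarks (landmark extraction with Dlib is covered in Chapter 2 of the thesis) are used as features and that scikit-learn provides the SVM; the actual features, kernel, and training procedure in the thesis may differ.

    import numpy as np
    import dlib                    # face detector and 68-point landmark predictor
    from sklearn.svm import SVC    # the SVM classifier

    LABELS = ["exaggerated", "happy", "neutral", "speaking"]

    detector = dlib.get_frontal_face_detector()
    predictor = dlib.shape_predictor("shape_predictor_68_face_landmarks.dat")

    def landmark_features(gray_frame):
        # Returns a flattened, roughly scale-normalized 68-point landmark vector
        # for the first face found in the cropped webcam region, or None.
        faces = detector(gray_frame, 1)
        if not faces:
            return None
        shape = predictor(gray_frame, faces[0])
        pts = np.array([(shape.part(i).x, shape.part(i).y) for i in range(68)], float)
        pts -= pts.mean(axis=0)            # remove translation
        norm = np.linalg.norm(pts) or 1.0  # remove scale
        return (pts / norm).ravel()        # 136-dimensional feature vector

    # Training (X: rows of landmark features, y: integer indices into LABELS):
    #   clf = SVC(kernel="rbf", C=1.0, gamma="scale").fit(X, y)
    # Prediction on the webcam region of a sampled frame:
    #   feat = landmark_features(gray_frame)
    #   label = LABELS[clf.predict(feat.reshape(1, -1))[0]] if feat is not None else None

Tracking the predicted label over consecutive sampled frames gives the expression-change signal that is combined with the comment and volume cues.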

Table of Contents
Chapter 1  Introduction
    1.1  Research Motivation and Objectives
    1.2  Thesis Organization
Chapter 2  Related Work
    2.1  Object Detection
    2.2  Face Detection
        2.2.1  Face Detection with OpenCV
        2.2.2  Face Detection with Dlib
        2.2.3  Landmark Extraction with Dlib
    2.3  Video Summarization
        2.3.1  Video Summarization Based on Text Analysis
        2.3.2  Video Summarization Based on Face and Voice Recognition
        2.3.3  Video Summarization Based on Visual and Audio Analysis
Chapter 3  Algorithm Design and System Implementation
    3.1  System Workflow
    3.2  Comment Analysis
        3.2.1  Introduction to the Twitch Platform
        3.2.2  Crawling Comment Data
        3.2.3  Selecting Video Segments
        3.2.4  Method 1: Cumulative Count of Segments Meeting a Threshold
        3.2.5  Method 2: Average Number of Comments per Segment
        3.2.6  Method 3: Popular Keyword Analysis
    3.3  Streamer Expression Analysis
        3.3.1  Expression Dataset
        3.3.2  Expression Recognition Training
        3.3.3  Cropping the Webcam Region
        3.3.4  Synchronizing OpenCV with the Video Reading Speed
        3.3.5  Expression Recognition Results
    3.4  Streamer Voice Analysis
        3.4.1  Audio Dataset
        3.4.2  Model Architecture
        3.4.3  Background Music Removal Results
Chapter 4  Results and Evaluation
    4.1  System Environment
    4.2  Automatic Video Summarization System
        4.2.1  Interface Description
        4.2.2  Video Summarization Results
Chapter 5  Experiments
    5.1  Experiment 1: Positive Correlation between Highlight Quality and Comment Count
    5.2  Experiment 2: Comparison of Manually Edited and System-Generated Clips
    5.3  Experiment 3: User Preference Study
Chapter 6  Conclusions and Future Work
References


Full-text release date: 2024/07/31 (campus network)
Full-text release date: not authorized for public release (off-campus network)
Full-text release date: not authorized for public release (National Central Library: Taiwan NDLTD system)