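Graduate student: 張宗雅
Zhung-Ya Chang
Thesis title: 基於影劇故事分析之影片摘要
Movie Summary Based on Story Analysis
Advisor: 楊傳凱
Chuan-Kai Yang
Oral defense committee: 林伯慎
Bor-Shen Lin
Yuan-Cheng Lai
Degree: Master's
Department: College of Management - Department of Information Management
Year of publication: 2023
Academic year of graduation: 111 (ROC calendar)
Language: Chinese
Number of pages: 67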
Keywords: Video summarization, Natural language, Face recognition, Speaker diarization
    When watching a long-running drama series or a movie sequel, viewers may find that they have forgotten the earlier plot. Catching up is costly: a movie typically runs about 90 minutes, and European and American series can run to dozens of episodes. A video summary that selects the important segments of a video helps viewers quickly review its content.

    To this end, this thesis proposes a movie summarization system. Given an input movie, three separate models process its text, its visual content, and its audio, respectively. Deep learning and natural language processing methods are combined to produce a summary that follows the semantics of the story.

    In the visual model, we use a face recognition model together with speaker diarization to identify who is speaking in the current frame, and then pair the corresponding character name with the subtitles to assist in selecting clips for the movie summary. In the text model, we first apply an abstractive dialogue summarization model to the preprocessed subtitle dialogue to obtain an inferred summary, and retrieve the necessary information about the film (synopsis, main cast, and so on) from the IMDb database. A Transformer model then measures the semantic relevance between the film synopsis and the subtitle lines to find the most relevant passages of dialogue. Finally, the subtitle timestamps are used to locate the corresponding footage and generate the summary video.
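    The last two steps described above (scoring subtitle lines against the synopsis, then using subtitle timestamps to cut clips) can be illustrated with a minimal sketch. This is not the thesis's implementation: the abstract states that a Transformer model provides the semantic relevance score, whereas the sketch below substitutes the purely lexical `difflib.SequenceMatcher` from the Python standard library so that it runs without any model download. The `SubtitleLine` type, the function names, and the `top_k` and `max_gap` parameters are all hypothetical.

```python
from dataclasses import dataclass
from difflib import SequenceMatcher


@dataclass
class SubtitleLine:
    """One subtitle entry (hypothetical structure for this sketch)."""
    start: float  # start time in seconds
    end: float    # end time in seconds
    text: str


def relevance(a: str, b: str) -> float:
    """Lexical similarity in [0, 1]; a stand-in for a semantic score."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def select_clips(synopsis, subtitles, top_k=2, max_gap=1.0):
    """Return (start, end) time ranges for the most synopsis-relevant lines."""
    # Score each subtitle line by its best match against any synopsis sentence.
    scored = [
        (max(relevance(sentence, line.text) for sentence in synopsis), line)
        for line in subtitles
    ]
    # Keep the top_k highest-scoring lines, then restore chronological order.
    best = sorted(scored, key=lambda pair: pair[0], reverse=True)[:top_k]
    chosen = sorted((line for _, line in best), key=lambda line: line.start)

    # Merge lines that are at most max_gap seconds apart into a single clip.
    clips = []
    for line in chosen:
        if clips and line.start - clips[-1][1] <= max_gap:
            clips[-1] = (clips[-1][0], max(clips[-1][1], line.end))
        else:
            clips.append((line.start, line.end))
    return clips
```

    Replacing `relevance` with a function that compares sentence embeddings would bring the sketch closer to the semantic matching the abstract describes; the clip selection and merging logic would stay the same.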

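    Chinese Abstract
    English Abstract
    Acknowledgements
    Table of Contents
    List of Figures
    List of Tables
    Chapter 1 Introduction
        1.1 Research Motivation and Objectives
        1.2 Thesis Organization
    Chapter 2 Literature Review
        2.1 Video Summarization
        2.2 Movie Analysis
        2.3 Face Recognition
        2.4 Natural Language Processing
        2.5 Speaker Diarization
    Chapter 3 Algorithm Design and System Implementation
        3.1 System Flow
        3.2 Video Preprocessing
            3.2.1 Blank Frames
            3.2.2 Rule-of-Thirds Composition
            3.2.3 Shot Segmentation
            3.2.4 Face Detection
        3.3 Face Recognition
        3.4 Speaker Diarization
        3.5 Text Preprocessing
            3.5.1 Subtitle Cleaning
            3.5.2 Retrieving IMDb Data
        3.6 Dialogue Summarization
        3.7 Semantic Similarity
            3.7.1 Difflib
            3.7.2 Transformer Model
            3.7.3 Semantic Similarity Summarization Rules
    Chapter 4 Results and Evaluation
        4.1 System Environment
        4.2 Dataset
        4.3 Dialogue Summarization Evaluation Methods
            4.3.1 ROUGE
            4.3.2 BERTScore
        4.4 Dialogue Summarization Experimental Results
        4.5 Semantic Similarity Experimental Results
    Chapter 5 Conclusion and Future Work
    References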

