
Graduate student: 吳美莼
Mei-chun Wu
Thesis title: 基於內容的影片大小調整系統
Content-Aware Video Resizing
Advisors: 鮑興國
Hsing-Kuo Kenneth Pao
楊傳凱
Chuan-kai Yang
Oral defense committee: 項天瑞
Tien-Ruey Hsiang
Degree: Master
Department: College of Electrical Engineering and Computer Science - Department of Computer Science and Information Engineering
Year of publication: 2008
Academic year of graduation: 96 (ROC calendar, 2007–2008)
Language: Chinese
Pages: 113
Chinese keywords: 影片摘要, 切縫, 圖形分割演算法, 動態執行
English keywords: video summarization, seam carving, graph-cut algorithm, dynamic programming
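  • Modern recording devices such as mobile phones, digital cameras, PDAs, laptops, and DV camcorders are readily available, and both their quality and their resolution have improved greatly, which has led to a flood of video data. Video abstraction, or video summarization, focuses on how to quickly and effectively extract useful information from large amounts of unprocessed video, and has therefore become an important research topic in recent years. Although many methods have been developed, each still has its limitations and drawbacks.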

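    Inspired by the paper "Seam Carving for Content-Aware Image Resizing" by Avidan et al., Chen et al. devised a method for video carving: within the video volume formed by the video, they repeatedly find and remove the 2D sheet with minimal cost, thereby shortening the video's duration. However, Chen et al.'s method still leaves considerable room for improvement.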

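    First, the extracted 2D sheet may be discontinuous: two spatially adjacent pixels may lie on different frames that are not temporally adjacent, which can fragment the video. Second, computing with the graph-cut algorithm results in a lengthy computation time. Finally, the method requires a relatively large amount of memory, so the input video must be split into several segments, with the length of each segment adjusted to the available memory, before it can be processed; worse, this segmentation can lead to less desirable results.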

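    The main contribution of this thesis is a new method that handles the problems described above within the length of a single shot of video. First, by modifying the dynamic programming approach in Avidan et al.'s seam carving paper, we can extract continuous 2D sheets and prevent the video from fragmenting. Second, unlike the graph-cut algorithm used by Chen et al., our method is simpler and more efficient: it performs a simpler task but runs about two orders of magnitude (roughly 100 times) faster. Finally, memory consumption is also greatly reduced, to about one tenth.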

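    To make our study more complete, we also propose methods for lengthening a video and for changing its frame size, and we compare the results with existing methods.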


    Rapid advances in modern technology have significantly improved video capture, in both quality and performance, and have thus produced a flood of video data. Video abstraction, or video summarization, which focuses on how to quickly and efficiently extract useful information from large amounts of raw video data, has therefore become an important research issue in recent years. Numerous approaches have been developed, but each has its drawbacks or limitations.

    Inspired by the work of Avidan et al. on seam carving for content-aware image resizing, Chen et al. generalized the idea to perform video carving: they repeatedly find and remove the 2D sheet with the minimal cost from the video volume, thereby shortening the video in time. However, several issues remain in Chen et al.'s approach, which leaves room for further improvement.
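
To make the notion of removing a 2D sheet from the video volume concrete, here is a minimal Python sketch. It is not taken from the thesis or from Chen et al.; the function name `remove_sheet`, the nested-list layout `volume[t][y][x]`, and the encoding of a sheet as one frame index per pixel position are assumptions made purely for illustration.

```python
def remove_sheet(volume, sheet):
    """Drop one pixel per (y, x) position, shortening the video by one frame.

    volume: list of frames, volume[t][y][x] is a pixel value (assumed layout).
    sheet:  sheet[y][x] is the frame index whose pixel is removed at (y, x).
    Returns a new volume with len(volume) - 1 frames.
    """
    n_frames = len(volume)
    height = len(volume[0])
    width = len(volume[0][0])
    result = [[[None] * width for _ in range(height)]
              for _ in range(n_frames - 1)]
    for y in range(height):
        for x in range(width):
            cut = sheet[y][x]
            # Temporal column of this pixel position with the cut frame skipped.
            column = [volume[t][y][x] for t in range(n_frames) if t != cut]
            for t, value in enumerate(column):
                result[t][y][x] = value
    return result


# Three frames of size 1x2; the sheet cuts frame 0 at x=0 and frame 1 at x=1.
video = [[[10, 11]], [[20, 21]], [[30, 31]]]
shorter = remove_sheet(video, [[0, 1]])
```

Each pixel position loses exactly one sample along the time axis, so every removed sheet shortens the video by one frame, and repeating the step shortens it further.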

    First, the extracted 2D sheet may not be smooth; that is, two spatially adjacent pixels may sit on two different frames that are not temporally adjacent, which can cause fragmented results. Second, the graph-cut algorithm employed entails a lengthy computation time. Finally, the required memory is relatively large; as a result, the input video has to be broken into subsets so that each subset can be loaded into memory in its entirety and processed. Worse yet, such a partitioning scheme may lead to undesired sub-optimal results.
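
The smoothness requirement can be stated as a simple check. The sketch below is illustrative only: it assumes a sheet is stored as one frame index per pixel position, and it assumes "temporally adjacent" means that spatially neighbouring pixels are cut from frames at most `max_jump` apart (default 1, mirroring the connectivity rule of image seam carving). The function name and the threshold are hypothetical, not the thesis's definitions.

```python
def is_continuous(sheet, max_jump=1):
    """Return True if spatially adjacent pixels are cut from nearby frames.

    sheet[y][x] is the frame index removed at pixel (y, x).
    max_jump is an assumed threshold on the allowed frame-index difference.
    """
    height = len(sheet)
    width = len(sheet[0])
    for y in range(height):
        for x in range(width):
            # Compare with the right-hand neighbour.
            if x + 1 < width and abs(sheet[y][x] - sheet[y][x + 1]) > max_jump:
                return False
            # Compare with the neighbour below.
            if y + 1 < height and abs(sheet[y][x] - sheet[y + 1][x]) > max_jump:
                return False
    return True
```

A sheet such as `[[0, 1], [1, 2]]` passes this check, while `[[0, 3], [1, 2]]` fails because two horizontally adjacent pixels would be cut three frames apart.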

    The main contribution of this thesis is a new approach that addresses all the mentioned issues in one shot. First, by modifying the dynamic programming approach originally adopted in Avidan et al.'s seam carving work, we can extract smooth 2D sheets and thus avoid a fragmented appearance. Second, unlike Chen et al.'s graph-cut algorithm, our approach is much simpler and more efficient, and performs similar tasks about two orders of magnitude faster. Finally, memory consumption is also reduced by about one order of magnitude.
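
For background, the dynamic programming step of image seam carving can be sketched as follows. This is the standard 2D formulation for a single image, not the modified video algorithm proposed in the thesis, which this record does not describe in detail. The function name and the nested-list energy map are assumptions for illustration.

```python
def find_vertical_seam(energy):
    """Return seam[y] = x for a minimum-energy connected vertical seam.

    energy[y][x] is a non-negative per-pixel cost (for example, gradient
    magnitude). Consecutive rows of the seam differ by at most one column.
    """
    height = len(energy)
    width = len(energy[0])
    # cost[y][x]: minimum cumulative energy of any seam ending at (y, x).
    cost = [list(energy[0])]
    for y in range(1, height):
        prev = cost[-1]
        row = []
        for x in range(width):
            lo, hi = max(x - 1, 0), min(x + 1, width - 1)
            row.append(energy[y][x] + min(prev[lo:hi + 1]))
        cost.append(row)
    # Backtrack from the cheapest cell in the last row.
    x = min(range(width), key=lambda i: cost[-1][i])
    seam = [x]
    for y in range(height - 1, 0, -1):
        lo, hi = max(x - 1, 0), min(x + 1, width - 1)
        x = min(range(lo, hi + 1), key=lambda i: cost[y - 1][i])
        seam.append(x)
    seam.reverse()
    return seam


energy = [[1, 4, 3],
          [5, 2, 6],
          [7, 8, 1]]
seam = find_vertical_seam(energy)  # [0, 1, 2], total energy 1 + 2 + 1 = 4
```

The table is filled in a single pass over the pixels, which gives a sense of why a dynamic programming formulation can be much cheaper than a general graph cut.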

    To make our work more complete, we also propose methods to extend the length of a video, as well as to resize a video in the spatial domain. Results are shown and compared with existing approaches, where applicable, to demonstrate the effectiveness of the proposed approach.

    Chapter 1 Introduction
    1.1 Foreword
    1.2 Objective
    Chapter 2 Related work
    2.1 Video summarization related work
    2.1.1 Frame-based
    2.1.1.1 Fast forward
    2.1.1.2 Key frame
    2.1.1.3 Other
    2.1.2 Object-based
    2.1.3 Surface-based
    2.2 Video synthesis related work
    2.2.1 Frame-based
    2.2.2 Surface-based
    2.3 Video retargeting related work
    2.3.1 Cropping
    2.3.2 Object-based
    2.3.3 Warping
    2.3.4 Surface-based
    Chapter 3 Video temporal retargeting
    3.1 Operator
    3.2 Background
    3.2.1 2D image seam carving
    3.2.2 Optimal energy algorithm
    3.2.3 Dynamic programming
    3.3 Our video temporal reduction approach
    3.3.1 Flow
    3.3.2 Our algorithm
    3.4 Video temporal extension
    3.4.1 Background
    3.5 Our video temporal extension approach
    3.5.1 Flow
    3.5.2 Our algorithm
    Chapter 4 Video spatial reduction
    4.1 Background
    4.2 Our approach
    4.2.1 Flow
    4.2.2 Our algorithm
    Chapter 5 Results
    5.1 Experimental environment
    5.2 Comparison of results
    5.2.1 Temporal reduction
    5.2.2 Temporal extension
    5.2.3 Spatial reduction
    Chapter 6 Limitations and future work
    6.1 Limitations
    6.2 Future work
    References
    Notes

    [1] AVIDAN S., SHAMIR A.: Seam Carving for Content-Aware Image Resizing. In SIGGRAPH '2007 (2007).
    [2] BOYKOV Y., JOLLY M. P.: Interactive Graph Cuts for Optimal Boundary & Region Segmentation of Objects in N-D Images. In ICCV '2001 (2001).
    [3] BOYKOV Y., KOLMOGOROV V.: An Experimental Comparison of Min-Cut/Max-Flow Algorithms for Energy Minimization in Vision. IEEE Transactions on Pattern Analysis and Machine Intelligence '2004 (2004), pp. 1124–1137.
    [4] BENNETT E. P., MCMILLAN L.: Computational Time-Lapse Video. In SIGGRAPH '2007 (2007).
    [5] CHEN B., SEN P.: Video Carving. In Eurographics '2008 (2008).
    [6] DIVAKARAN A., PEKER K. A., RADHAKRISHNAN R., XIONG Z., CABASSON R.: Video Summarization using MPEG-7 Motion Activity and Audio Descriptors. Tech. Rep. TR2003-34, MERL – Mitsubishi Electric Research Laboratories '2003 (2003).
    [7] FARIN D., EFFELSBERG W., DE WITH P. H. N.: Robust Clustering-Based Video Summarization with Integration of Domain-Knowledge. In ICME '2002 (2002), pp. 89–92.
    [8] HUA X., LU L., ZHANG H.: AVE: Automated Home Video Editing. In ACM Multimedia '2003 (2003), pp. 490–497.
    [9] TAO C., JIA J., SUN H.: Active Window Oriented Dynamic Video Retargeting. In ICCV '2007 (2007).
    [10] KIM C., HWANG J.: An Integrated Scheme for Object-Based Video Abstraction. In ACM Multimedia '2000 (2000), pp. 303–311.
    [11] KANG H., MATSUSHITA Y., TANG X., CHEN X.: Space-Time Video Montage. In CVPR '2006 (2006), pp. 1331–1338.
    [12] LIU F., GLEICHER M.: Video Retargeting: Automating Pan and Scan. In ACM Multimedia '2006 (2006), pp. 241–250.
    [13] MA Y., HUA X., LU L., ZHANG H.: A Generic Framework of User Attention Model and Its Application in Video Summarization. IEEE Transactions on Multimedia '2005 (2005), pp. 907–919.
    [14] NAM J., TEWFIK A. H.: Video Abstract of Video. In The 3rd IEEE Workshop on Multimedia Signal Processing '1999 (1999), pp. 117–122.
    [15] NGO C. W., ZHANG H. J., CHIN R. T., PONG T. C.: Motion-Based Video Representation for Scene Change Detection. In ICPR '2000 (2000).
    [16] PAN C., CHUANG Y., HSU W. H.: NTU TRECVID-2007 Fast Rushes Summarization System. In Proceedings of the International Workshop on TRECVID Video Summarization '2007 (2007), pp. 74–78.
    [17] PAL C., JOJIC N.: Interactive Montages of Sprites for Indexing and Summarizing Security Video. In CVPR '2005 (2005), p. 1192.
    [18] PETROVIC N., JOJIC N., HUANG T. S.: Adaptive Video Fast Forward. Multimedia Tools and Applications '2005 (2005), pp. 327–344.
    [19] PRITCH Y., RAV-ACHA A., PELEG S.: Non-Chronological Video Synopsis and Indexing. IEEE Transactions on Pattern Analysis and Machine Intelligence '2008 (2008).
    [20] SMITH M., KANADE T.: Video Skimming and Characterization through the Combination of Image and Language Understanding Techniques. In CVPR '1997 (1997), pp. 775–781.
    [21] WANG J., REINDERS M., LAGENDIJK R., LINDENBERG J., KANKANHALLI M.: Video Content Representation on Tiny Devices. In ICME '2004 (2004), vol. 3, pp. 1711–1714.
    [22] SUKMARG O., RAO K.: Fast Object Detection and Segmentation in MPEG Compressed Domain. In IEEE TENCON '2000 (2000).
    [23] SCHÖDL A., SZELISKI R., SALESIN D. H., ESSA I.: Video Textures. In SIGGRAPH '2000 (2000), pp. 489–498.
    [24] WOLF L., GUTTMANN M., COHEN-OR D.: Non-Homogeneous Content-Driven Video-Retargeting. In ICCV '2007 (2007), pp. 1–6.
    [25] ZHU Z., WU X., FAN J., ELMAGARMID A. K., AREF W. G.: Exploring Video Content Structure for Hierarchical Summarization. Multimedia Systems '2004 (2004), pp. 98–115.
    [26] AGARWALA A., COLIN ZHENG K., PAL C., AGRAWALA M., COHEN M., CURLESS B., SALESIN D., SZELISKI R.: Panoramic Video Textures. In SIGGRAPH '2005 (2005).
    [27] HERMANS C., VANAKEN C., MERTENS T., VAN REETH F., BEKAERT P., CURLESS B., SALESIN D., SZELISKI R.: Augmented Panoramic Video. In EUROGRAPHICS '2008 (2008).
    [28] KWATRA V., SCHÖDL A., ESSA I., TURK G., BOBICK A.: Graphcut Textures: Image and Video Synthesis Using Graph Cuts. In SIGGRAPH '2003 (2003).
    [29] LIU F., GLEICHER M.: Automatic Image Retargeting with Fisheye-View Warping. In UIST '2005 (2005).
    [30] SETLUR V., TAKAGI S., RASKAR R., GLEICHER M., GOOCH B.: Automatic Image Retargeting. In Proc. Sym. on Mobile and Ubiquitous Multimedia '2005 (2005).
    [31] RUBINSTEIN M., SHAMIR A., AVIDAN S.: Improved Seam Carving for Video Retargeting. In SIGGRAPH '2008 (2008).
    [32] FAN X., XIE X., ZHOU H.-Q., MA W.-Y.: Looking into Video Frames on Small Displays. In ACM Multimedia '2003 (2003), pp. 247–250.
    [33] WANG J., XU Y., SHUM H.-Y., COHEN M. F.: Video Tooning. In ACM Trans. Graph. '2004 (2004), pp. 574–583.
    [34] IRANI M., ANANDAN P., BERGEN J., KUMAR R., HSU S.: Efficient Representations of Video Sequences and Their Applications. In Image Communication '1996 (1996), 8(4), pp. 327–351.
    [35] BAR-JOSEPH Z., EL-YANIV R., LISCHINSKI D., WERMAN M.: Texture Mixing and Texture Movie Synthesis Using Statistical Learning. In IEEE Transactions on Visualization and Computer Graphics '2001 (2001), pp. 120–135.
    [36] WEI L.-Y., LEVOY M.: Fast Texture Synthesis Using Tree-Structured Vector Quantization. In SIGGRAPH '2000 (2000), pp. 479–488. ISBN 1-58113-208-5.
    [37] RAV-ACHA A., PRITCH Y., LISCHINSKI D., PELEG S.: Dynamosaicing: Mosaicing of Dynamic Scenes. In PAMI '2007 (2007), pp. 1789–1801.
    [38] SUH B., LING H., BEDERSON B. B., JACOBS D. W.: Automatic Thumbnail Cropping and Its Effectiveness. In User Interface Software and Technology '2003 (2003), pp. 95–104.
    [39] YUZHU L., LIANGSHOU W.: 3D Segmentation & Visualization of MRI Brain Images. Digital Image Processing '2005 (2005).
