Graduate Student: 張幸卿 Hsing-Ching Chang
Thesis Title: 基於內容的視訊摘要及收縮 (Content Based Video Resizing in Temporal and Spatial Domain)
Advisor: 楊傳凱 Chuan-Kai Yang
Committee Members: 林惠勇 Huei-Yung Lin, 莊永裕 Yung-Yu Chuang, 陳祝嵩 Chu-Song Chen, 鍾國亮 Kuo-Liang Chung
Degree: Doctor (博士)
Department: College of Management, Department of Information Management
Year of Publication: 2009
Academic Year of Graduation: 97 (ROC calendar)
Language: English
Number of Pages: 110
Chinese Keywords: seam carving, dynamic programming, video carving, video resizing, video summarization, two-stage dynamic programming
English Keywords: Video Carving, Video Resizing, Video Summarization, Two Stage Dynamic Programming Algorithm
Chinese Abstract (translated):
The rapid advance of modern technology has made video recording devices ubiquitous, and with them has come a flood of video data. Only a small fraction of this accumulated footage is worth preserving, however, so extracting the essential information from massive video collections is an urgent problem. This thesis proposes a seam-based greedy algorithm that effectively summarizes the important content of a video. Our aim is to condense a video into a compact version carrying as much information as possible, without any preprocessing.
We model a video as a 3D volume with one temporal and two spatial dimensions, and shrink it by successively removing "unimportant" 2D seam surfaces. Besides the usual connectivity and monotonicity constraints on seam-surface extraction, we propose an additional locality constraint, which suppresses some of the artifacts that video condensation would otherwise introduce.
Because a greedy algorithm pursues only locally optimal solutions, it sidesteps the two problems that graph-cut methods face: memory consumption and computational cost. We also propose using the Sobel operator to improve the quality of the condensed video.
We further generalize the algorithm to resize a video (that is, to change its width and height) so that it fits playback devices of different screen sizes.
Finally, we compare our results with a modified two-stage dynamic programming algorithm. Although the seam surfaces found by our algorithm have a higher cost than those found by the two-stage algorithm, our results are of better visual quality, which further demonstrates the effectiveness of our method.
English Abstract:
With the rapid adoption of video capture devices and the resulting growth of video data, the effective and efficient extraction of essential content from this vast amount of data has never been so imperative.
In this thesis we present a new seam-based approach to video summarization driven by a greedy algorithm. Our goal is to compact the video as densely as possible into an informative video volume without any prior knowledge (such as saliency maps).
In our method, we successively remove non-activity seam surfaces to achieve this goal. In addition to the two standard constraints of connectivity and monotonicity, we introduce a third constraint, locality, which helps suppress the visual artifacts introduced by video summarization.
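The three constraints can be illustrated with a small validity check on a per-frame seam, represented as one column index per row. This is an illustrative sketch, not the thesis's implementation; the function name and the `locality` window size are assumptions made for the example.

```python
def seam_is_valid(seam_cols, prev_seam_cols, locality=1):
    """Check a per-frame vertical seam (one column per row) against the
    constraints named in the text. Monotonicity (exactly one pixel per row)
    is implied by the list shape; connectivity and locality are tested
    explicitly. The `locality` window size is an assumed parameter."""
    # Connectivity: seam columns in adjacent rows differ by at most one.
    connected = all(abs(a - b) <= 1
                    for a, b in zip(seam_cols, seam_cols[1:]))
    # Locality: the seam may not drift more than `locality` columns from
    # the corresponding seam in the neighboring frame.
    local = all(abs(a - b) <= locality
                for a, b in zip(seam_cols, prev_seam_cols))
    return connected and local
```

A seam that jumps two columns between rows violates connectivity, and a seam far from its neighbor-frame counterpart violates locality, even if each is internally well-formed.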
Because the greedy strategy always seeks a local minimum, it avoids the memory and computation limitations faced by other approaches, such as those based on graph cuts.
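As a minimal sketch of the greedy step in the 2D image case (the per-removal local optimum of seam carving; the thesis's 3D seam surfaces generalize this to the video volume), the minimum-cost vertical seam of one energy map can be found with dynamic programming and then deleted. Function names are illustrative.

```python
def min_vertical_seam(energy):
    """Dynamic-programming search for the minimum-cost vertical seam in a
    2D energy map. Connectivity: the seam pixel in row r must lie within
    one column of the seam pixel in row r-1."""
    rows, cols = len(energy), len(energy[0])
    cost = [energy[0][:]]                      # cumulative-cost table
    for r in range(1, rows):
        row = []
        for c in range(cols):
            lo, hi = max(0, c - 1), min(cols, c + 2)
            row.append(energy[r][c] + min(cost[r - 1][lo:hi]))
        cost.append(row)
    # Backtrack from the cheapest cell in the last row.
    seam = [min(range(cols), key=lambda c: cost[-1][c])]
    for r in range(rows - 2, -1, -1):
        c = seam[-1]
        lo, hi = max(0, c - 1), min(cols, c + 2)
        seam.append(min(range(lo, hi), key=lambda k: cost[r][k]))
    seam.reverse()
    return seam

def remove_seam(frame, seam):
    """Delete one pixel per row, shrinking the frame's width by one."""
    return [row[:c] + row[c + 1:] for row, c in zip(frame, seam)]
```

Repeating this pair of steps, always taking the currently cheapest seam, is exactly the greedy local-minimum strategy: each removal is optimal for the current map, with no guarantee about the global sequence.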
We also use the Sobel operator to improve the quality of the output. In addition, we extend our technique to the spatial domain for the purpose of video retargeting.
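A minimal sketch of using the Sobel operator as the energy function on one grayscale frame, written in pure Python for clarity (a real implementation would use a vectorized convolution). The clamp-to-edge border handling and the |Gx| + |Gy| magnitude approximation are assumptions of this example.

```python
# Standard Sobel kernels for horizontal and vertical derivatives.
SOBEL_X = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
SOBEL_Y = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]

def sobel_energy(gray):
    """Gradient-magnitude energy map |Gx| + |Gy| for one grayscale frame.
    Border pixels are handled by clamping coordinates to the frame."""
    rows, cols = len(gray), len(gray[0])

    def px(r, c):  # clamp-to-edge border handling
        return gray[min(max(r, 0), rows - 1)][min(max(c, 0), cols - 1)]

    energy = []
    for r in range(rows):
        row = []
        for c in range(cols):
            gx = sum(SOBEL_X[i][j] * px(r + i - 1, c + j - 1)
                     for i in range(3) for j in range(3))
            gy = sum(SOBEL_Y[i][j] * px(r + i - 1, c + j - 1)
                     for i in range(3) for j in range(3))
            row.append(abs(gx) + abs(gy))
        energy.append(row)
    return energy
```

High energy marks strong edges, which a seam search should avoid crossing; flat regions get near-zero energy and become the preferred places to remove.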
Finally, we modify the two-stage dynamic programming algorithm to search for seams in 3D and compare it against our method. We demonstrate that although our method is not guaranteed to reach a globally minimal solution, our results are of better quality and comparatively free of visual artifacts.