
Student: Chih-long Lin (林志隆)
Thesis Title: Content-based Video Retrieval with Multi Features (基於多特徵之視訊內容檢索)
Advisor: Jiann-Jone Chen (陳建中)
Committee Members: 唐政元, 何瑁鎧, Yi-Leh Wu (吳怡樂)
Degree: Master
Department: College of Electrical Engineering and Computer Science - Department of Electrical Engineering
Year of Publication: 2014
Graduation Academic Year: 102
Language: Chinese
Pages: 78
Chinese Keywords: video content retrieval, image multi-feature extraction, database retrieval
English Keywords: content-based video retrieval, multi-feature
    With recent advances in multimedia coding and communication technologies, and the widespread adoption of the Internet, multimedia communication has become one of the primary media for information exchange. Video and image data transmitted over the network form a sea of media, so efficiently finding the video data that matches a user's interests in this sea is a key technology for today's multimedia databases. The main goal of content-based video retrieval is to let users search a huge video database quickly and accurately for the video clips they are interested in. Most current research on video content retrieval still measures retrieval similarity with extracted frame features; however, multiple feature types can be used to describe the content of the many frames in a video, and a single feature is often insufficient to characterize frame content effectively and correctly. How to integrate multiple feature types to describe the information content of many frames is therefore a challenging research problem. This study proposes a video content retrieval method that combines color, texture, and SIFT-BOW (Bag of Words) features to describe multiple frames. These three image features not only describe the global characteristics of a frame correctly and effectively, but also capture the features of its local regions. In our experiments, we compare the color-feature differences between consecutive frames to segment a video into clips; we then collect the descriptive features of every frame within a video clip and average them to obtain the clip's representative feature. At retrieval time, comparing the similarity between a clip's representative feature and the query frame's feature indicates how relevant that clip is to the query. To evaluate the accuracy of the proposed method, we analyze both the proposed multi-feature video retrieval method and retrieval methods that describe frames with a single feature, and we also implement the multi-feature video retrieval method proposed by Y. Deng [10] for comparison. Experimental results confirm that the proposed multi-feature video content retrieval system achieves better performance.
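The clip-segmentation step described above, comparing the color features of consecutive frames and cutting wherever they diverge, can be sketched as follows. The bin count, the L1 histogram distance, and the cut threshold are illustrative assumptions; the abstract does not fix these values.

```python
import numpy as np

def color_histogram(frame, bins=16):
    """Per-channel color histogram of an RGB frame, normalized to sum to 1."""
    hist = [np.histogram(frame[..., c], bins=bins, range=(0, 256))[0]
            for c in range(3)]
    hist = np.concatenate(hist).astype(float)
    return hist / hist.sum()

def segment_video(frames, threshold=0.4):
    """Split a frame sequence into clips wherever the L1 difference between
    consecutive frames' color histograms exceeds the threshold (a scene cut)."""
    clips, start = [], 0
    prev = color_histogram(frames[0])
    for i in range(1, len(frames)):
        cur = color_histogram(frames[i])
        if np.abs(cur - prev).sum() > threshold:  # scene cut detected
            clips.append((start, i))
            start = i
        prev = cur
    clips.append((start, len(frames)))
    return clips
```

Each returned `(start, end)` pair is one clip, the basic media unit whose per-frame features are then averaged into a representative feature.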


    With advances in multimedia codec and communication technologies and the prevalence of the Internet, multimedia communication has become one of the major information media. Under these circumstances, the image and video data on the Internet form a sea of media, and finding the content a user desires in this sea has become an important problem. Content-Based Video Retrieval (CBVR) methods have been proposed to search for video clips of interest precisely and quickly. Among these studies, extracting image features for similarity measurement is widely adopted. However, a single kind of feature usually cannot describe video content well enough to provide satisfactory retrieval results; in general, several kinds of image/video features are extracted for effective retrieval, and how to integrate them efficiently is critical and challenging for improving retrieval performance. In this thesis, we propose to integrate color, texture, and SIFT-BOW (Bag of Words) image features to describe a video clip. These three features describe not only global image characteristics but also those of local regions. In our experiments, the color-histogram difference between consecutive frames is used to detect scene cuts, and the resulting video clips serve as the basic media units for description and retrieval. The average of the image features within one clip is used as its representative feature. To perform retrieval, the feature of a query image is extracted and its similarity to the representative feature of each clip in the database is computed for similarity ranking. For comparison, video retrieval using only a single feature is implemented, and the multi-feature method proposed by Y. Deng [10] is also carried out.
Experiments show that the proposed CBVR method outperforms the previous method by 38.7% in the precision-recall (PR) rate, and that retrieval with multiple features also improves PR performance compared with retrieval by a single feature.
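The representative-feature and similarity-ranking steps can be sketched as below. The Euclidean distance is an assumption for illustration; the abstract does not name the similarity measure used.

```python
import numpy as np

def clip_representative(frame_features):
    """Average the feature vectors of all frames in one clip to obtain the
    clip's representative feature, as described in the abstract."""
    return np.mean(np.asarray(frame_features, dtype=float), axis=0)

def rank_clips(query_feature, clip_features):
    """Rank database clips by distance between the query feature and each
    clip's representative feature; returns clip indices, best match first."""
    q = np.asarray(query_feature, dtype=float)
    dists = [np.linalg.norm(q - np.asarray(f, dtype=float))
             for f in clip_features]
    return sorted(range(len(clip_features)), key=lambda i: dists[i])
```

In the thesis the feature vectors would be the concatenated color, texture, and SIFT-BOW descriptors; here any fixed-length vectors work.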

    Abstract (Chinese)
    Abstract (English)
    Acknowledgments
    Table of Contents
    List of Figures
    Chapter 1  Introduction
      1.1  Research Background and Motivation
      1.2  Overview of Research Topics and Methods
      1.3  Thesis Organization
    Chapter 2  Background and Related Techniques
      2.1  Color Feature Models
        2.1.1  RGB Color Model
        2.1.2  YUV Color Model
        2.1.3  HSV Color Model
      2.2  Texture Description Histogram
        2.2.1  MPEG-7 EHD Image Partitioning
        2.2.2  MPEG-7 EHD Edge Detection
        2.2.3  MPEG-7 EHD Edge Distribution Histogram
      2.3  SIFT-Bag of Words Algorithm
        2.3.1  Scale-Invariant Feature Transform (SIFT)
          2.3.1.1  Extrema Detection in Scale Space
          2.3.1.2  Keypoint Localization and Refinement
          2.3.1.3  Orientation Assignment
          2.3.1.4  Keypoint Descriptor
        2.3.2  K-Means Clustering
        2.3.3  Constructing the Image Bag of Words (BOW)
    Chapter 3  Multi-Feature Video Scene Retrieval
      3.1  Input Video Scenes
      3.2  Multi-Feature Description
        3.2.1  Color and Texture Features
        3.2.2  SIFT-BOW Features
      3.3  Multi-Feature Integration and Similarity Ranking
      3.4  Summary of the Multi-Feature Video Frame Retrieval System
    Chapter 4  Experimental Results
      4.1  Experimental Setup
        4.1.1  Video Frame Database
        4.1.2  Evaluation Criteria
      4.2  Methods Compared
        4.2.1  RGB Color Histogram
        4.2.2  Motion Features
        4.2.3  Gabor Texture Feature
      4.3  Retrieval Performance Evaluation
        4.3.1  Results of Multi-Feature Video Frame Retrieval
        4.3.2  Comparison with Other Methods
        4.3.3  Discussion
    Chapter 5  Conclusions and Future Work
    Appendix
    References
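The SIFT-BOW construction outlined in Chapter 2 (cluster local descriptors with k-means, then quantize each image's descriptors into a visual-word histogram) can be sketched as follows. Descriptor extraction itself is omitted, and the vocabulary size and random initialization are illustrative assumptions, not values from the thesis.

```python
import numpy as np

def build_vocabulary(descriptors, k=8, iters=20, seed=0):
    """Cluster local descriptors (e.g. 128-D SIFT vectors) with plain
    k-means; the k cluster centers form the visual-word vocabulary."""
    rng = np.random.default_rng(seed)
    X = np.asarray(descriptors, dtype=float)
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(iters):
        # assign each descriptor to its nearest center
        d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = d.argmin(axis=1)
        # move each center to the mean of its assigned descriptors
        for j in range(k):
            if np.any(labels == j):
                centers[j] = X[labels == j].mean(axis=0)
    return centers

def bow_histogram(descriptors, centers):
    """Quantize an image's descriptors against the vocabulary and return
    the normalized visual-word histogram (the BOW feature)."""
    X = np.asarray(descriptors, dtype=float)
    d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
    hist = np.bincount(d.argmin(axis=1), minlength=len(centers)).astype(float)
    return hist / hist.sum()
```

The resulting histogram is a fixed-length vector, which is what allows it to be concatenated with the color and texture features for retrieval.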

    [1]H. Freeman, “On the encoding of arbitrary geometric configurations,” IRE Transaction on Electronic Computing, vol. 10, no. 2, pp. 260-268, Aug. 1961.
    [2]B. Gunsel and A. M. Tekalp, “Shape similarity matching for query-by-example,” Pattern Recognition, vol. 31, no. 7, pp. 931-944, 1998.
    [3]P. Pala and S. Santini, “Image retrieval by shape and texture,” Pattern Recognition, vol. 32, pp. 517-527, 1999.
    [4]H. Nishida, “Structural feature indexing for retrieval of partially visible shapes,” Pattern Recognition, vol. 35, no. 1, pp. 55-67, 2002.
    [5]S. J. Park, D. K. Park, and C. S. Won, “Core experiments on MPEG-7 edge histogram descriptor,” MPEG document M5984, Geneva, May 2000.
    [6]D. K. Park, Y. S. Jeon, and C. S. Won, “Efficient use of local edge histogram descriptor,” in Proc. ACM Multimedia, pp. 51-54, 2000.
    [7]T. Lindeberg, “Scale-space theory: a basic tool for analyzing structures at different scales,” Journal of Applied Statistics, vol. 21, no. 1, pp. 225-270, 1994.
    [8]D. G. Lowe, “Distinctive image features from scale-invariant keypoints,” International Journal of Computer Vision, vol. 60, no. 2, pp. 91-110, Nov. 2004.
    [9]J. MacQueen, “Some methods for classification and analysis of multivariate observations,” in Proc. of 5th Berkeley Symposium on Mathematical Statistics and Probability, pp. 281-297, 1967.
    [10]Y. Deng and B. S. Manjunath, “Content-based search of video using color, texture, and motion,” in Proc. IEEE Conf. Image Process., vol. 2, pp. 534-537, Oct. 1997.
    [11]W. Yanjun and F. Ruixia, “Video retrieval based on combined color and texture feature in MPEG-7,” International Symposium on Photoelectronic Detection and Imaging, 2007.
    [12]J. Hafner, H. S. Sawhney, W. Equitz, M. Flickner, and W. Niblack, “Efficient color histogram indexing for quadratic form distance functions,” IEEE Trans. on Pattern Analysis and Machine Intelligence, vol. 17, no. 7, pp. 729-736, 1995.
    [13]L. Yang, K. Yu, J. Li, and S. Li, “An effective variable block-size early termination algorithm for H.264 video coding,” IEEE Trans. on Circuits and Syst. for Video Technol., vol. 15, no. 6, June 2005.
    [14]Y.-T. Lin, B.-J. Yen, C.-H. Chang, H.-F. Yang, and G. C. Lee, “Indexing and teaching focus mining of lecture videos,” IEEE Int. Symp. Multimedia, pp. 681-686, Dec. 2009.
    [15]K. J. Jhang, “Vision-based global localization using scale invariant keypoints,” M.S. thesis, Dept. Comp. Sci. & Info. Eng., Tainan Univ., Tainan, Taiwan, 2009.
    [16]Wawayu. (2012, Nov. 23). Application of the SIFT algorithm: object recognition with the Bag-of-Words model [Online]. Available: http://blog.csdn.net/v_JULY_v/article/details/6555899
    [17]P. Salembier and T. Sikora, “Texture descriptors,” in Introduction to MPEG-7: Multimedia Content Description Interface. New York: Wiley, 2002.
    [18]Wikipedia contributors. (2013, Sep. 12). RGB color model [Online]. Available: http://en.wikipedia.org/wiki/RGB_color_model
    [19]Wikipedia contributors. (2013, Sep. 27). YUV [Online]. Available: http://en.wikipedia.org/wiki/YUV
    [20]Wikipedia contributors. (2013, Oct. 1). HSL and HSV [Online]. Available: http://en.wikipedia.org/wiki/HSL_and_HSV
    [21]J.-K. Wu, M. S. Kankanhalli, J.-H. Lim, and D. Hong, “Color feature extraction,” in Perspectives on Content-Based Multimedia Systems. Springer US, 2002.
    [22]D.-C. He, L. Wang, and J. Guibert, “Texture feature extraction,” Pattern Recognition Letters, vol. 6, no. 4, pp. 269-273, Sep. 1987.
    [23]Y. Mingqiang, K. Kidiyo, and R. Joseph, “A survey of shape feature extraction techniques,” in Pattern Recognition Techniques, Technology and Applications. Vienna: i-Tech, p. 626, 2008.
    [24]H.-H. Wang, D. Mohamad, and N. A. Ismail, “Semantic gap in CBIR: automatic object spatial relationships semantic extraction and representation,” International Journal of Image Processing (IJIP), vol. 4, no. 3, 2010.
    [25]Q.-Y. Li, H. Hu, and Z. Shi, “Semantic feature extraction using genetic programming in image retrieval,” in Proc. 17th International Conference on Pattern Recognition, 2004.
