
Graduate Student: Che-Yi Wu (吳哲逸)
Thesis Title: Image Retrieval with Fusion Descriptor from Deep Learning and Compressed Domain Features (融合深度學習及壓縮域特徵之影像檢索技術)
Advisor: Jing-Ming Guo (郭景明)
Oral Defense Committee: Tien-Ying Kuo (郭天穎), Chung-An Shen (沈中安), Chi-Yi Tsai (蔡奇謚), Chih-Hsien Hsia (夏至賢)
Degree: Master
Department: College of Electrical Engineering and Computer Science, Department of Electrical Engineering
Year of Publication: 2016
Academic Year of Graduation: 104
Language: Chinese
Pages: 112
Chinese Keywords: 基於內容的影像檢索、深度學習、卷積神經網路、半色調、區塊截斷編碼
English Keywords: Content-Based Image Retrieval, Deep Learning, Convolutional Neural Network, Halftoning, Block Truncation Coding


    This thesis presents an effective image retrieval scheme that combines low-level features from Dot-Diffused Block Truncation Coding (DDBTC) with high-level features from a Convolutional Neural Network (CNN) model.
    The low-level features are constructed with the proposed two-layer codebook from the DDBTC bitmap and its maximum and minimum quantizers; the two-layer codebook overcomes the dimensionality limitation of the original single codebook. The high-level features are extracted with a CNN, a highly effective deep learning approach whose learned features have been widely applied to recognition and classification and are regarded as close to human perception.
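The bitmap and two quantizers that feed the low-level codebook can be illustrated with plain Block Truncation Coding, the scheme that DDBTC extends with dot diffusion; the 4×4 grayscale block below and the hard mean threshold are illustrative assumptions, not the thesis's exact encoder:

```python
import numpy as np

def btc_block(block):
    """Encode one grayscale block with plain Block Truncation Coding.

    Returns the bitmap plus the two quantizers (the "maximum" and
    "minimum" values substituted for 1s and 0s at decoding).  DDBTC
    replaces this hard threshold with dot diffusion, but its
    bitmap/max/min outputs used as retrieval features are analogous.
    """
    mean = block.mean()
    bitmap = block >= mean                                    # 1 bit per pixel
    hi = block[bitmap].mean() if bitmap.any() else mean       # max quantizer
    lo = block[~bitmap].mean() if (~bitmap).any() else mean   # min quantizer
    return bitmap.astype(np.uint8), hi, lo

# Toy 4x4 block with two dark and two bright columns
block = np.array([[10, 200, 30, 220],
                  [15, 210, 25, 230],
                  [12, 205, 28, 215],
                  [11, 208, 26, 225]], dtype=float)
bitmap, hi, lo = btc_block(block)
```

Collecting the bitmap patterns and (hi, lo) pairs over all blocks of an image is what the codebook stage then quantizes into a histogram feature.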
    By fusing the DDBTC and CNN features, the extended deep-learning two-layer codebook features (DL-TLCF) are generated using the proposed two-layer codebook, dimension reduction, and similarity normalization to improve the overall retrieval rate.
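One plausible reading of the similarity-normalization step is sketched below: each feature type yields its own distance vector over the database, and normalizing each vector by its mean before summing keeps either modality from dominating the fusion. The function name and the mean-based normalization are assumptions for illustration; the thesis's exact scheme may differ:

```python
import numpy as np

def fused_ranking(q_low, db_low, q_high, db_high):
    """Rank database images by a fused, normalized distance.

    q_low/q_high: query descriptors (e.g. codebook and CNN features).
    db_low/db_high: matching descriptor matrices, one row per image.
    Returns database indices ordered from most to least similar.
    """
    d_low = np.linalg.norm(db_low - q_low, axis=1)    # low-level distances
    d_high = np.linalg.norm(db_high - q_high, axis=1) # high-level distances
    # Mean normalization puts both distance sets on a comparable scale.
    d = d_low / d_low.mean() + d_high / d_high.mean()
    return np.argsort(d)

# Toy database: 10 images; query is image 3 itself.
rng = np.random.default_rng(0)
db_low = rng.normal(size=(10, 8))
db_high = rng.normal(size=(10, 16))
order = fused_ranking(db_low[3], db_low, db_high[3], db_high)
```

A querying image identical to a database entry should rank that entry first, since both of its normalized distances are zero.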
    Two metrics, the average precision rate (APR) and average recall rate (ARR), are employed to evaluate the fused features on various datasets. As the experimental results document, the proposed schemes achieve superior retrieval rates compared with state-of-the-art methods based on either low- or high-level features alone. The method is thus a strong candidate for a wide range of practical image retrieval applications.
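The per-query quantities behind APR and ARR can be sketched as follows; the function name and the class-label convention are assumptions, with APR/ARR obtained by averaging these values over all queries at a fixed cutoff:

```python
def precision_recall_at(ranked_labels, query_label, n_relevant, L):
    """Precision and recall of one query at rank cutoff L.

    ranked_labels: class labels of database images, ordered from most
                   to least similar to the query.
    n_relevant:    total database images sharing the query's class.
    """
    hits = sum(1 for lbl in ranked_labels[:L] if lbl == query_label)
    return hits / L, hits / n_relevant

# Top-4 retrieval containing 3 of the 3 relevant images:
p, r = precision_recall_at([0, 0, 1, 0, 2], query_label=0,
                           n_relevant=3, L=4)
```

Here precision is 3/4 (one of the four returned images is irrelevant) while recall is 3/3 (every relevant image was retrieved).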

    Chinese Abstract I
    Abstract II
    Table of Contents III
    List of Figures and Tables V
    Acknowledgments IX
    Chapter 1  Introduction 1
      1.1 Research Background and Motivation 1
      1.2 Research Design and Objectives 2
      1.3 Thesis Organization 4
    Chapter 2  Literature Review of Convolutional Neural Network Techniques 5
      2.1 Feedforward Neural Networks 6
      2.2 Backpropagation Neural Networks 8
      2.3 Factors Affecting Neural Network Performance 10
      2.4 Convolutional Neural Networks 14
        2.4.1 Convolution 16
        2.4.2 Nonlinear Activation Functions 18
        2.4.3 Pooling 20
        2.4.4 Training Methods 21
        2.4.5 Visualization 23
    Chapter 3  Literature Review of Content-Based Image Retrieval 28
      3.1 Block Truncation Coding Techniques 29
        3.1.1 Block Truncation Coding 30
        3.1.2 Ordered-Dither Block Truncation Coding 31
        3.1.3 Error-Diffused Block Truncation Coding 32
        3.1.4 Dot-Diffused Block Truncation Coding 33
      3.2 Local Binary Patterns 37
      3.3 Scale-Invariant Feature Transform 39
      3.4 Gabor Filters 45
      3.5 Feature Description Methods 46
      3.6 Dimensionality Reduction Methods 49
    Chapter 4  Fusion of Deep Learning and Compressed-Domain Features 51
      4.1 GoogLeNet Fully-Connected High-Level Features 57
        4.1.1 GoogLeNet Architecture Analysis 58
        4.1.2 Caffe Overview and High-Level Feature Extraction 63
      4.2 Two-Layer Codebook Transformation of DDBTC Features 65
        4.2.1 DDBTC and Codebook Application 67
        4.2.2 Two-Layer Codebook and Its Features 70
      4.3 Fused Features and Their Application to VLAD Descriptors 76
      4.4 Similarity Evaluation and Normalized Similarity Computation 79
      4.5 Experimental Results 81
        4.5.1 Retrieval Performance Evaluation 81
        4.5.2 Datasets and Test Environment 83
        4.5.3 Performance Evaluation of the Fused Features 91
    Chapter 5  Conclusions and Future Work 105
    References 106

