
Graduate student: 饒宗翰
Tsung-Han Jao
Thesis title: 使用分離單元之哈希演算法於圖片檢索
Hashing Algorithm with Decomposition Unit for Image Retrieval
Advisor: 陳冠宇
Kuan-Yu Chen
Oral examination committee: 王新民
Hsin-Min Wang
古鴻炎
Hung-Yan Gu
林伯慎
Bor-Shen Lin
陳冠宇
Kuan-Yu Chen
Degree: Master
Department: College of Electrical Engineering and Computer Science - Department of Computer Science and Information Engineering
Publication year: 2019
Academic year of graduation: 107
Language: Chinese
Number of pages: 111
Keywords (Chinese): 圖片檢索, 機器學習, 哈希演算法, 二元碼
Keywords (English): Image retrieval, Machine learning, Hash algorithm, Binary code
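    In recent years, with the growth of the internet, vast numbers of new images and videos are produced every day, and how to quickly retrieve the data we need has become a widely discussed problem. Problems of this kind are collectively called “image retrieval”. The goal of image retrieval is to find images that are similar or relevant to a query. A common approach first converts images into feature vectors, then uses the similarity between feature vectors as the ranking criterion, and finally outputs the images in ranked order as the retrieval result. A more efficient approach binarizes the image feature vectors into what we call “binary code” vectors and then uses the Hamming distance to compute the similarity between each pair of images. Models that convert images into binary-code vector representations are collectively called “hash algorithms”.
    This thesis first introduces the classic hash algorithms and describes the architecture, advantages, and drawbacks of each type of model. It then proposes a novel “decomposition unit” hash algorithm. The method aims to keep the classification-relevant features of an image in the binary-code representation, that is, to separate the feature information that would hurt classification accuracy out of the binary-code space as far as possible. The proposed decomposition-unit hash algorithm therefore yields binary-code features that are rich in classification information, which raises the mean average precision of image retrieval. Notably, the proposed decomposition-unit approach can be combined with various deep-learning-based hash algorithms.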
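    The mean average precision mentioned in the abstract can be illustrated with a short Python sketch. The abstract does not state the exact evaluation protocol (for example, whether rankings are truncated at a top-k cut-off), so the version below assumes the common definition in which average precision is normalized by the number of relevant items found in the ranked list. The function names and the toy relevance lists are illustrative assumptions, not taken from the thesis.

```python
from typing import Sequence


def average_precision(relevance: Sequence[bool]) -> float:
    """Average precision for one query.

    `relevance[i]` is True when the item at rank i + 1 is relevant.
    Assumption: normalize by the number of relevant items in the list.
    """
    hits = 0
    precision_sum = 0.0
    for rank, is_relevant in enumerate(relevance, start=1):
        if is_relevant:
            hits += 1
            # Precision measured at the position of each relevant item.
            precision_sum += hits / rank
    return precision_sum / hits if hits else 0.0


def mean_average_precision(all_relevance: Sequence[Sequence[bool]]) -> float:
    """Mean of the per-query average precision values."""
    return sum(average_precision(r) for r in all_relevance) / len(all_relevance)


# Toy example: two queries with hand-made relevance judgements.
rankings = [
    [True, False, True],  # AP = (1/1 + 2/3) / 2
    [False, True],        # AP = (1/2) / 1
]
map_score = mean_average_precision(rankings)
print(f"mAP = {map_score:.4f}")
```

    A ranking that places relevant images earlier scores higher, which is why binary codes that carry more class information would be expected to raise this metric.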


    Owing to the immense volume of pictures and multimedia data on the internet, finding precise and relevant images or multimedia content for a given query has become an emerging research area in recent decades. “Image retrieval” can be viewed as the cornerstone of this line of research. A common strategy is to convert images into feature vectors and then use a similarity function to quantify the similarity between a query and each candidate image. The retrieval results are then obtained by sorting the computed scores. However, this strategy is usually time-consuming. A more efficient alternative is therefore to convert complex image statistics into binary representations, called “binary codes”. Given binary codes, the Hamming distance can be used to compute the relevance between a pair of images. The most important component in this process is the function that produces the binary codes, which is referred to as a “hash algorithm”.
    Regarding hash algorithms for image retrieval, this thesis makes at least two major contributions. First, we introduce a variety of hash algorithms in detail, including systematic comparisons and their theoretical advantages and drawbacks. Second, we propose a novel framework, named the “Decomposition Unit”, to improve the performance of image retrieval. The idea is to encode only the statistics that are useful for classification into the binary code, while the residual information is stored in the decomposition unit. As a result, the binary code mainly contains classification-aware information rather than all the information in the original image, which enhances image retrieval performance.
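    The binary-code retrieval pipeline described in the abstract can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the method of the thesis: the hypothetical `binarize` function uses simple thresholding in place of a learned hash algorithm, and the feature vectors are made-up toy values.

```python
from typing import List, Sequence


def binarize(features: Sequence[float], threshold: float = 0.0) -> int:
    """Pack a real-valued feature vector into an integer bit string.

    Assumption: plain thresholding stands in for a learned hash function.
    """
    code = 0
    for value in features:
        code = (code << 1) | (1 if value > threshold else 0)
    return code


def hamming_distance(a: int, b: int) -> int:
    """Number of bit positions in which two binary codes differ."""
    return bin(a ^ b).count("1")


def rank_by_hamming(query_code: int, database_codes: Sequence[int]) -> List[int]:
    """Return database indices sorted by increasing Hamming distance."""
    return sorted(
        range(len(database_codes)),
        key=lambda i: hamming_distance(query_code, database_codes[i]),
    )


# Toy 4-dimensional features for three database images and one query.
database = [
    [0.9, -0.2, 0.4, -0.7],
    [-0.5, 0.3, -0.1, 0.8],
    [0.7, -0.6, -0.3, -0.2],
]
query = [0.8, -0.1, 0.5, -0.9]

database_codes = [binarize(v) for v in database]
query_code = binarize(query)
ranking = rank_by_hamming(query_code, database_codes)
print(ranking)
```

    Because each comparison reduces to an XOR and a bit count, ranking by Hamming distance is much cheaper than computing a similarity function over real-valued feature vectors.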

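    Abstract (Chinese)
    Abstract
    Acknowledgements
    Table of Contents
    List of Figures
    List of Tables
    Chapter 1 Introduction
        1.1 Research Motivation and Objectives
        1.2 Overview of Chapters
    Chapter 2 Related Work
        2.1 Choice of Input Features
            2.1.1 Boundary Features
            2.1.2 Global Features
        2.2 Choice of Similarity-Preserving Method
            2.2.1 Pairwise Input
            2.2.2 Triplet Input
            2.2.3 Listwise Input
            2.2.4 Pointwise Input
            2.2.5 Implicit
        2.3 Model Construction
            2.3.1 Convolutional Neural Network Hashing (CNNH)
            2.3.2 Improved Convolutional Neural Network Hashing (improved CNNH)
            2.3.3 Deep Convolutional Neural Network Hashing (DCNNH)
            2.3.4 Improved Deep Convolutional Neural Network Hashing (VGGHashing)
        2.4 Loss Function Definitions
            2.4.1 Similarity-Preserving Term
            2.4.2 Balance-Preserving Term
            2.4.3 Quantization Term
            2.4.4 Reconstruction Term
        2.5 Similarity and Distance Definitions
        2.6 Evaluation Criteria
        2.7 Summary
    Chapter 3 Experimental Method
        3.1 Decomposition Unit
        3.2 Model Architecture
        3.3 Choice of Feature Input
        3.4 Definition of Loss Terms
        3.5 Balance-Term Loss Function
        3.6 Datasets
            3.6.1 Cifar-10 [37]
        3.7 Experimental Procedure
    Chapter 4 Experimental Results
        4.1 Verifying the Effect of the Decomposition Unit
        4.2 Adding the Decomposition Unit to Improved Deep Convolutional Neural Network Hashing
        4.3 Adding the Reconstruction Term to Improved Deep Convolutional Neural Network Hashing
        4.4 Testing Parameter Combinations
        4.5 Testing Different Reconstruction-Term Targets
        4.6 Balance Loss Term
        4.7 Combining the Decomposition Unit and the Balance Loss Term
        4.8 Comparison with Other Hash Algorithms
    Chapter 5 Conclusion
    Chapter 6 References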

    [1] M. H. Kiapour, X. Han, S. Lazebnik, A. C. Berg and T. L. Berg, “Where to Buy It: Matching Street Clothing Photos in Online Shops,” IEEE International Conference on Computer Vision (ICCV), pp. 3343-3351, 2015.
    [2] Filip Radenović, Ahmet Iscen, Giorgos Tolias, Yannis Avrithis, Ondřej Chum, “Revisiting Oxford and Paris: Large-Scale Image Retrieval Benchmarking,” arXiv:1803.11285.
    [3] Moses S. Charikar, “Similarity estimation techniques from rounding algorithms,” STOC '02 Proceedings of the thirty-fourth annual ACM symposium on Theory of computing, pp. 380-388, 2002.
    [4] J. He, W. Liu, and S.-F. Chang, “Scalable similarity search with optimized kernel hashing,” the 16th ACM SIGKDD international conference on Knowledge discovery and data mining, pp. 1129-1138, 2010.
    [5] Jianqiu Ji, Jianmin Li, Shuicheng Yan, Bo Zhang, Qi Tian, “Super-Bit Locality-Sensitive Hashing,” NIPS'12 Proceedings of the 25th International Conference on Neural Information Processing Systems, pp. 108-116, 2012.
    [6] H. Liu, R. Wang, S. Shan and X. Chen, “Deep Supervised Hashing for Fast Image Retrieval,” IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 2064-2072, 2016.
    [7] Jian Zhang, Yuxin Peng, “SSDH: Semi-supervised Deep Hashing for Large Scale Image Retrieval,” IEEE Transactions on Circuits and Systems for Video Technology, p. 99, 2016.
    [8] Aude Oliva, Antonio Torralba, “Modeling the Shape of the Scene: A Holistic Representation of the Spatial Envelope,” International Journal of Computer Vision, 2001.
    [9] Karen Simonyan, Andrew Zisserman, “Very Deep Convolutional Networks for Large-Scale Image Recognition,” arXiv:1409.1556.
    [10] Kaiming He, Xiangyu Zhang, Shaoqing Ren, Jian Sun, “Identity Mappings in Deep Residual Networks,” arXiv:1603.05027.
    [11] Christian Szegedy, Wei Liu, Yangqing Jia, Pierre Sermanet, Scott Reed, Dragomir Anguelov, Dumitru Erhan, Vincent Vanhoucke, Andrew Rabinovich, “Going Deeper with Convolutions,” arXiv:1409.4842.
    [12] Yair Weiss, Antonio Torralba, Rob Fergus, “Spectral Hashing,” NIPS'08 Proceedings of the Advances in Neural Information Processing Systems, pp. 1753-1760, 2008.
    [13] Fang Zhao, Yongzhen Huang, Liang Wang, Tieniu Tan, “Deep Semantic Ranking Based Hashing for Multi-Label Image Retrieval,” arXiv:1501.06272.
    [14] Jun Wang, Wei Liu, Andy X. Sun, Yu-Gang Jiang, “Learning Hash Codes with Listwise Supervision,” IEEE International Conference on Computer Vision (ICCV), pp. 3032-3039, 2013.
    [15] Mohammad Norouzi, David J. Fleet, Ruslan R. Salakhutdinov, “Hamming Distance Metric Learning,” NIPS'12 Proceedings of the 25th International Conference on Neural Information Processing Systems.
    [16] Fumin Shen, Chunhua Shen, Wei Liu, Heng Tao Shen, “Supervised Discrete Hashing,” IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 37-45, 2015.
    [17] Go Irie, Zhenguo Li, Xiao-Ming Wu, Shih-Fu Chang, “Locally Linear Hashing for Extracting Non-linear Manifolds,” IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 2123-2130, 2014.
    [18] Rongkai Xia, Yan Pan, Hanjiang Lai, Cong Liu, Shuicheng Yan, “Supervised Hashing for Image Retrieval via Image Representation Learning,” AAAI Conference on Artificial Intelligence, pp. 2156-2162, 2014.
    [19] Hanjiang Lai, Yan Pan, Ye Liu, Shuicheng Yan, “Simultaneous Feature Learning and Hash Coding with Deep Neural Networks,” IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 3270-3278, 2015.
    [20] Kevin Lin, Huei-Fang Yang, Jen-Hao Hsiao, Chu-Song Chen, “Deep Learning of Binary Hash Codes for Fast Image Retrieval,” IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), pp. 27-35, 2015.
    [21] Tian-qiang Peng, Fang Li, “Image retrieval based on deep Convolutional Neural Networks and binary hashing learning,” IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2017.
    [22] Aristides Gionis, Piotr Indyk, Rajeev Motwani, “Similarity Search in High Dimensions via Hashing,” the 25th International Conference on Very Large Data Bases, 1999.
    [23] Brian Kulis, Kristen Grauman, “Kernelized Locality-Sensitive Hashing for Scalable Image Search,” IEEE Transactions on Pattern Analysis and Machine Intelligence, pp. 1092-1104, 2012.
    [24] Alexis Joly, Olivier Buisson, “Random maximum margin hashing,” IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 873-880, 2011.
    [25] Zhongming Jin, Yao Hu, Yue Lin, Debing Zhang, Shiding Lin, Deng Cai, Xuelong Li, “Complementary Projection Hashing,” IEEE International Conference on Computer Vision (ICCV), 2013.
    [26] Alex Krizhevsky, Geoffrey E. Hinton, “Using very deep autoencoders for content-based image retrieval,” European Symposium on Artificial Neural Networks (ESANN), 2011.
    [27] J. Song, “Binary Generative Adversarial Networks for Image Retrieval,” arXiv:1708.04150.
    [28] Alex Krizhevsky, Ilya Sutskever, Geoffrey E. Hinton, “ImageNet classification with deep convolutional neural networks,” NIPS'12 Proceedings of the 25th International Conference on Neural Information Processing Systems, pp. 1097-1105, 2012.
    [29] Yunchao Gong, Svetlana Lazebnik, Albert Gordo, Florent Perronnin, “Iterative Quantization: A Procrustean Approach to Learning Binary Codes for Large-Scale Image Retrieval,” IEEE Transactions on Pattern Analysis and Machine Intelligence, pp. 2916-2929, 2012.
    [30] Wei Liu, Jun Wang, Rongrong Ji, Yu-Gang Jiang, Shih-Fu Chang, “Supervised hashing with kernels,” IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 2074-2081, 2012.
    [31] Hemanth Venkateswara, Jose Eusebio, Shayok Chakraborty, Sethuraman Panchanathan, “Deep Hashing Network for Unsupervised Domain Adaptation,” arXiv:1706.07522.
    [32] Thanh-Toan Do, Anh-Dzung Doan, Ngai-Man Cheung, “Learning to Hash with Binary Deep Neural Network,” 16th European Conference on Computer Vision (ECCV), 2016.
    [33] Zhangjie Cao, Mingsheng Long, Jianmin Wang, Philip S. Yu, “HashNet: Deep Learning to Hash by Continuation,” arXiv:1702.00758.
    [34] Qi Li, Zhenan Sun, Ran He, Tieniu Tan, “Deep Supervised Discrete Hashing,” arXiv:1705.10999.
    [35] Xin Luo, Liqiang Nie, Xiangnan He, Ye Wu, Zhen-Duo Chen, Xin-Shun Xu, “Fast Scalable Supervised Hashing,” SIGIR '18 The 41st International ACM SIGIR Conference on Research & Development in Information Retrieval, pp. 735-744, 2018.
    [36] Hung-Shin Lee, Yu-Ding Lu, Chin-Cheng Hsu, Yu Tsao, “Discriminative autoencoders for speaker verification,” IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2017.
    [37] Huei-Fang Yang, Kevin Lin, Chu-Song Chen, “Supervised Learning of Semantics-Preserving Hash via Deep Convolutional Neural Networks,” IEEE Transactions on Pattern Analysis and Machine Intelligence.
    [38] A. Krizhevsky, “Learning Multiple Layers of Features from Tiny Images,” 2009.
    [39] Peng Li, Meng Wang, Jian Cheng, Changsheng Xu, Hanqing Lu, “Spectral hashing with semantically consistent graph for image indexing,” IEEE Transactions on Multimedia, pp. 141-152, 2013.
    [40] Tat-Seng Chua, Jinhui Tang, Richang Hong, Haojie Li, Zhiping Luo, Yan-Tao Zheng, “NUS-WIDE: A Real-World Web Image Database from National University of Singapore,” ACM International Conference on Image and Video Retrieval (CIVR), 2009.
    [41] Venice Erin Liong, Jiwen Lu, Gang Wang, Pierre Moulin, Jie Zhou, “Deep hashing for compact binary codes learning,” IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 2475-2483, 2015.
