
Graduate student: 鍾杰廷
Chieh-Ting Chung
Thesis title: 基於拆分注意塊的深度卷積神經網路行人再識別之研究
A Study of Person Re-Identification Base on Deep Convolutional Neural Network with Split-Attention Block
Advisor: 吳怡樂
Yi-Leh Wu
Committee members: 唐政元
Zheng-Yuan Tang
陳建中
Jian-Zhong Chen
閻立剛
Li-Gang Yan
Degree: Master
Department: College of Electrical Engineering and Computer Science - Department of Computer Science and Information Engineering
Year of publication: 2022
Academic year of graduation: 110
Language: English
Number of pages: 41
Chinese keywords: 深度學習, 卷積神經網路, 行人再識別
English keywords: Deep Learning, Convolution Neural Network, Person Re-Identification
  • In recent years, thanks to more powerful hardware and advances in deep learning research, computer vision has produced many practical applications that are now closely tied to our daily lives. With the rapid development of person re-identification (re-ID), this technology can be applied in many scenarios, such as person tracking systems, unmanned stores, and assisted face recognition. Many new network architectures have been proposed in this field, among which the Batch DropBlock (BDB) Network, built on ResNet, has obtained good results on the person re-ID task. Inspired by ResNeSt, we use the state-of-the-art ResNeSt as the feature extractor and adjust the dimensions of the network, expecting better performance from the BDB Network. We run experiments on four common person re-ID datasets: Market-1501, CUHK03-Label, CUHK03-Detect, and DukeMTMC-reID. The experiments show that our network architecture obtains better results than the BDB Network on the Rank-1 and mAP evaluation metrics.
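    The abstract reports results with the Rank-1 and mAP metrics. As a rough illustration of what these two numbers measure, the following is a minimal sketch, not the thesis's evaluation code. The function name `rank1_and_map` and its arguments are hypothetical, and the sketch assumes a precomputed query-by-gallery distance matrix. It also omits the camera-based filtering that the standard benchmark protocols apply.

```python
from typing import List, Sequence, Tuple


def rank1_and_map(
    dist: Sequence[Sequence[float]],
    query_ids: Sequence[int],
    gallery_ids: Sequence[int],
) -> Tuple[float, float]:
    """Return (Rank-1 accuracy, mAP) for a query-by-gallery distance matrix.

    Assumption of this sketch: no same-camera filtering is applied.
    """
    rank1_hits = 0
    average_precisions: List[float] = []
    for q, row in enumerate(dist):
        # Rank gallery images from smallest to largest distance.
        order = sorted(range(len(row)), key=lambda g: row[g])
        matches = [gallery_ids[g] == query_ids[q] for g in order]
        if not any(matches):
            continue  # a query without a true match is skipped
        # Rank-1: is the closest gallery image the same identity?
        rank1_hits += int(matches[0])
        # Average precision: mean of precision@k at each rank k of a true match.
        hits = 0
        precisions: List[float] = []
        for k, is_match in enumerate(matches, start=1):
            if is_match:
                hits += 1
                precisions.append(hits / k)
        average_precisions.append(sum(precisions) / len(precisions))
    n = len(average_precisions)
    if n == 0:
        raise ValueError("no query has a matching gallery identity")
    return rank1_hits / n, sum(average_precisions) / n


# Toy example: 2 queries, 3 gallery images (identities are made up).
toy_dist = [[0.1, 0.5, 0.9], [0.2, 0.3, 0.8]]
rank1, mean_ap = rank1_and_map(toy_dist, query_ids=[1, 2], gallery_ids=[1, 2, 1])
print(rank1, mean_ap)
```

    In the toy example, the first query is matched correctly at rank 1 and the second is not, so Rank-1 is 0.5. The per-query average precisions are 5/6 and 1/2, giving an mAP of 2/3.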

    Abstract (Chinese)
    Abstract
    Contents
    LIST OF FIGURES
    LIST OF TABLES
    Chapter 1. Introduction
      1.1 Research Background
      1.2 Research Motivation
    Chapter 2. Related Work
      2.1 Image Retrieval
      2.2 Data Augmentation
      2.3 Deep Residual Network
      2.4 ResNeSt
    Chapter 3. Proposed Method
      3.1 Baseline Architecture
      3.2 Proposed Architecture
    Chapter 4. Experiments
      4.1 Dataset
        4.1.1 Market-1501
        4.1.2 CUHK03
        4.1.3 DukeMTMC-reID
      4.2 Evaluation Metrics and Training Details
        4.2.1 Evaluation Metrics
        4.2.2 Training details
      4.3 Overall Performance Comparison
      4.4 Performance of the ResNeSt Feature Extraction
      4.5 The Impact of Down-Sampling of the ResNeSt
      4.6 The Impact of Erased Height Ratio
      4.7 The Impact of Number of Fully Connected Layers in the Feature Dropping Branch
      4.8 The Impact of Number of Fully Connected Layers in the Global Branch
      4.9 The Impact of Number of Fully Connected Layers in the Global Branch and the Feature Dropping Branch
    Chapter 5. Conclusions and Future Work
    References

