
Author: Yi-Hsiang Wang (王逸翔)
Thesis title: Hard Sample Learning in Prototype-based Memory of Unsupervised Domain Adaptation for Person Re-identification (基於原型記憶庫之難樣本學習於無監督領域自適應行人重識別)
Advisor: Shun-Feng Su (蘇順豐)
Committee members: Sheng-Luen Chung (鍾聖倫), Chung-Hsien Kuo (郭重顯), Mei-Yung Chen (陳美勇), Yo-Ping Huang (黃有評)
Degree: Master
Department: College of Electrical Engineering and Computer Science - Department of Electrical Engineering
Year of publication: 2021
Academic year of graduation: 109 (ROC calendar)
Language: English
Number of pages: 77
Keywords: Person Re-identification, Unsupervised Domain Adaptation, Domain Gap, Contrastive Learning, Pseudo Label, Memory Bank

Unsupervised domain adaptation for person re-identification aims to generalize a model pre-trained on a labeled source domain to an unlabeled target domain. Pseudo-label-based methods dominate the state of the art in this task. They apply a clustering algorithm to assign pseudo labels to the target data, but the domain gap makes those labels noisy. A memory-based approach addresses the noisy labels with a self-paced strategy and employs contrastive learning, which learns intrinsic feature representations between the extracted features and the corresponding prototypes in a memory bank; it has shown promising results. However, we argue that the learning process is greatly affected by the choice of prototypes and the maintenance of the memory bank. A prototype formed by averaging features may include features with noisy labels and features updated at different speeds. To overcome these problems, we propose a prototype-based memory that stores a single prototype for each class or cluster and updates it with the newest features from the current batch. Using a single prototype reduces the chance of including noisy labels and removes the inconsistent update frequency within a prototype. Moreover, hardest sample selection picks the hardest samples in a batch according to their positive pairs. The hardest samples within this restricted range help the encoder learn discriminative representations. The results show that the proposed method achieves state-of-the-art performance, with 74.9%, 85.7%, 30.6%, and 32.0% mAP and 86.0%, 93.7%, 59.3%, and 60.4% top-1 accuracy on four domain adaptation tasks among Market-1501, DukeMTMC-reID, and MSMT17.
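The single-prototype memory and hardest sample selection described above can be illustrated with a minimal Python sketch. It is not the thesis implementation. The class name `PrototypeMemory`, the use of cosine similarity, the rule "hardest positive = the same-label feature least similar to its prototype", the momentum blend (with 0.0 standing for complete replacement), the InfoNCE-style loss, and the temperature value 0.05 are all assumptions made for illustration.

```python
import math


def normalize(v):
    """Scale a vector to unit L2 norm."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def dot(a, b):
    """Inner product; equals cosine similarity for unit-norm vectors."""
    return sum(x * y for x, y in zip(a, b))


class PrototypeMemory:
    """Hypothetical memory holding one unit-norm prototype per class or cluster."""

    def __init__(self, init_prototypes, momentum=0.0):
        # init_prototypes: dict mapping label -> initial feature vector.
        self.protos = {k: normalize(v) for k, v in init_prototypes.items()}
        self.momentum = momentum  # assumed: 0.0 means complete replacement

    def update(self, features, labels):
        """Update each prototype seen in the batch with its hardest positive."""
        hardest = {}
        for feat, label in zip(features, labels):
            feat = normalize(feat)
            sim = dot(feat, self.protos[label])
            # Assumed rule: the hardest positive is the same-label feature
            # with the lowest similarity to the current prototype.
            if label not in hardest or sim < hardest[label][0]:
                hardest[label] = (sim, feat)
        m = self.momentum
        for label, (_, feat) in hardest.items():
            mixed = [m * p + (1.0 - m) * f
                     for p, f in zip(self.protos[label], feat)]
            self.protos[label] = normalize(mixed)

    def contrastive_loss(self, feature, label, temperature=0.05):
        """InfoNCE-style loss of one feature against all stored prototypes."""
        feat = normalize(feature)
        logits = {k: dot(feat, p) / temperature
                  for k, p in self.protos.items()}
        max_logit = max(logits.values())  # subtract the max for stability
        log_denominator = max_logit + math.log(
            sum(math.exp(z - max_logit) for z in logits.values())
        )
        return log_denominator - logits[label]


# Two labels, three batch features. For label 0, the second feature is
# less similar to the prototype [1, 0], so it is chosen as the hardest positive.
memory = PrototypeMemory({0: [1.0, 0.0], 1: [0.0, 1.0]})
memory.update([[0.9, 0.1], [0.6, 0.8], [0.1, 0.9]], [0, 0, 1])
loss = memory.contrastive_loss([0.6, 0.8], 0)
```

Because only one feature per label is written into the memory at each step, every prototype is refreshed as a whole, which is the property the abstract contrasts with averaged prototypes whose member features are updated at different speeds.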

Abstract (Chinese)
Abstract
Acknowledgements
Table of Contents
List of Figures
List of Tables
Chapter 1 Introduction
  1.1 Background
  1.2 Motivation
  1.3 Baseline Model
  1.4 Contributions
  1.5 Thesis Organization
Chapter 2 Related Works
  2.1 Unsupervised Domain Adaptation for Person Re-ID
  2.2 Self-Supervised Learning and Contrastive Learning
Chapter 3 Methodology
  3.1 Source Domain Pre-training
  3.2 Clustering Algorithm
  3.3 Network Architecture and Jointly Fine-tuning
  3.4 Prototype-based Memory
    3.4.1 Memory Initialization
    3.4.2 Memory Update with Hardest Sample Selection
  3.5 Prototype Contrastive Loss
  3.6 Domain Specific Batch Normalization
  3.7 Generalized Mean Pooling
Chapter 4 Experiments
  4.1 Datasets
    4.1.1 Market-1501
    4.1.2 DukeMTMC-reID
    4.1.3 MSMT17
  4.2 Evaluation Metrics and Protocol
    4.2.1 Cumulative Matching Characteristic (CMC) Curve and Top-k Accuracy
    4.2.2 Mean Average Precision (mAP)
    4.2.3 Evaluation Protocol
  4.3 Implementation Details
    4.3.1 Network Architecture
    4.3.2 Training Data Organization
    4.3.3 Network Optimization
    4.3.4 Environment
  4.4 Comparison with State-of-the-arts
  4.5 Parameter Analysis and Ablation Studies
    4.5.1 Number of Features for Prototype Update
    4.5.2 Complete Replacement and Momentum Coefficient m
    4.5.3 Type of Single Prototype
    4.5.4 Temperature Parameter τ in Prototype Contrastive Loss
    4.5.5 Ablation Studies of Our Proposed Method
    4.5.6 Ablation Studies of Other Components
  4.6 Further Discussion
    4.6.1 Effectiveness of Outliers
    4.6.2 Combination of Positive Pairs and Negative Pairs
    4.6.3 Attempt at Different Hard Sample Choice
    4.6.4 Effect of Cluster Size
    4.6.5 Comparison between End-to-end Training and Source Domain Pre-training
    4.6.6 Comparison with the Similar Method
Chapter 5 Conclusions and Future Work
  5.1 Conclusions
  5.2 Future Work
References


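Full text release date: 2024/09/29 (campus network)
Full text release date: 2026/09/29 (off-campus network)
Full text release date: 2026/09/29 (National Central Library: National Digital Library of Theses and Dissertations in Taiwan)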