Student: Shih-Chieh Chuang (莊士頡)
Thesis title: Continual Test-Time Adaptation with Weighted Contrastive Learning and Pseudo-Label Correction (基於加權對比學習和偽標籤修正的連續測試時間適應)
Advisor: Ching-Hu Lu (陸敬互)
Committee members: Shun-Feng Su (蘇順豐), Sheng-Luen Chung (鍾聖倫), Cheng-Ming Huang (黃正民), Jin-Shyan Lee (李俊賢)
Degree: Master
Department: Department of Electrical Engineering, College of Electrical Engineering and Computer Science
Year of publication: 2023
Graduation academic year: 111 (ROC calendar, i.e., 2022–2023)
Language: Chinese
Number of pages: 92
Chinese keywords: 深度學習、領域偏移、測試時間適應、偽標籤修正、對比學習、邊緣運算、影像辨識
English keywords: deep learning, domain shift, test-time adaptation, pseudo-label correction, contrastive learning, edge computing, image recognition

In recent years, computer vision services have faced real-time performance challenges because of the heavy computation required for image processing. With the rapid development of AIoT technology, cameras that combine edge computing with AIoT (referred to in this study as edge cameras) have achieved notable success in many application areas. However, these systems operate in constantly changing and diverse environments, which often causes domain shift and gradually degrades model accuracy, so the system must be able to adapt in real time. Although continual test-time adaptation has been proposed to address this problem, existing methods tend to rely on highly accurate pseudo-labels, and existing contrastive learning methods only consider aggregating features of the same class while ignoring the further aggregation of similar features within a class. This study therefore proposes Weighted Contrastive Learning and applies it both to source-domain pretraining and to the main training process of continual test-time adaptation. In addition, catastrophic forgetting can occur during continual adaptation. To address this, prior work stochastically overwrites model parameters with their source-domain values; under large domain shifts, however, the restored source-domain knowledge can act as noise and impair the model's ability to adapt. This study therefore proposes Domain-aware Pseudo-label Correction, which mitigates catastrophic forgetting and error accumulation without revisiting source-domain data while keeping the impact on adaptation performance as small as possible. Experimental results show that a source-domain pretrained model trained with Weighted Contrastive Learning improves accuracy by 0.46% on CIFAR10 and 1.88% on CIFAR100. In continual test-time adaptation, adapting CIFAR10 to CIFAR-10C reduces the average error rate by 17.78% compared with existing work, and adapting CIFAR100 to CIFAR-100C reduces it by 2.73%. When the system is deployed on an edge camera (an NVIDIA Jetson TX2), the frame rate improves by 26.42%.


In recent years, computer vision services have encountered real-time performance challenges due to the heavy computation required for image processing. The combination of edge computing and AIoT-enabled cameras, referred to here as edge cameras, has seen significant success in various tasks. However, the model on an edge camera requires real-time adaptability to maintain accuracy in the face of domain shifts caused by constantly changing environments. While continual test-time adaptation has been proposed to address this problem, existing methods rely on high-accuracy pseudo-labels. Moreover, current contrastive learning methods consider the aggregation of features from the same class but overlook the further aggregation of similar features within that class. We therefore propose "Weighted Contrastive Learning" and apply it in both source-domain pretraining and continual test-time adaptation. To address the catastrophic forgetting caused by continual adaptation, existing research stochastically restores model parameters to their source-domain values. However, under significant domain shifts the restored source-domain knowledge can act as noise and impair the model's adaptability. We therefore propose "Domain-aware Pseudo-label Correction" to mitigate catastrophic forgetting and error accumulation without accessing the original source-domain data, while minimizing the impact on model adaptability. Experimental results show that pretraining with Weighted Contrastive Learning (WeiContrast) improved accuracy by 0.46% and 1.88% on the CIFAR10 and CIFAR100 datasets, respectively. In continual test-time adaptation, adapting CIFAR10 to CIFAR-10C reduced the average error rate by 17.78% compared with existing research, and adapting CIFAR100 to CIFAR-100C reduced it by 2.73%. On an edge camera platform (NVIDIA Jetson TX2), the frame rate (FPS) increased by 26.42%.
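To make the two ideas above concrete, two minimal sketches follow. Both are illustrative PyTorch code with assumed names and hyperparameters; they are not the thesis's exact formulations. The first sketches a supervised contrastive loss in which each same-class (positive) pair is additionally weighted, here by feature similarity, so that already-similar features within a class are pulled together more strongly; the weighting rule is an assumption.

# Hypothetical sketch of a weighted supervised contrastive loss (PyTorch).
# The similarity-based weighting of positive pairs is an assumption for
# illustration; the thesis's actual weighting may differ.
import torch
import torch.nn.functional as F

def weighted_supcon_loss(features, labels, temperature=0.1):
    """features: (N, D) embeddings; labels: (N,) integer class ids."""
    z = F.normalize(features, dim=1)                      # unit-length embeddings
    sim = z @ z.t() / temperature                         # scaled pairwise similarities
    n = z.size(0)
    self_mask = torch.eye(n, dtype=torch.bool, device=z.device)
    sim = sim.masked_fill(self_mask, float('-inf'))       # exclude self-pairs
    log_prob = sim - torch.logsumexp(sim, dim=1, keepdim=True)

    pos_mask = (labels.unsqueeze(0) == labels.unsqueeze(1)) & ~self_mask
    with torch.no_grad():
        # Assumed weighting: emphasize positives whose cosine similarity is
        # already high, so similar same-class features aggregate further.
        w = ((z @ z.t()) + 1.0) / 2.0                     # cosine mapped to [0, 1]
        w = w * pos_mask.float()                          # keep positive pairs only
        w = w / w.sum(dim=1, keepdim=True).clamp_min(1e-8)

    loss_per_anchor = -(w * log_prob).sum(dim=1)
    valid = pos_mask.any(dim=1)                           # anchors with >= 1 positive
    return loss_per_anchor[valid].mean()

The second sketches the stochastic restoration of source-domain parameters used in prior continual test-time adaptation work (e.g., CoTTA): after each adaptation step, every weight element is reset to its source value with a small probability. Under a large domain shift these restored values can act as noise with respect to the current target distribution, which is the failure mode the proposed Domain-aware Pseudo-label Correction is designed to avoid without revisiting source data.

# Sketch of stochastic source-parameter restoration as used by prior work
# (e.g., CoTTA); variable names and the default probability are illustrative.
@torch.no_grad()
def stochastic_restore(model, source_state, p=0.01):
    """Reset each weight element to its source-domain value with probability p."""
    for name, param in model.named_parameters():
        if name in source_state:
            mask = (torch.rand_like(param) < p).float()
            restored = mask * source_state[name].to(param.device) + (1.0 - mask) * param
            param.copy_(restored)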

Chinese Abstract
Abstract
Acknowledgements
Table of Contents
List of Figures
List of Tables
Chapter 1  Introduction
  1.1  Research Motivation
  1.2  Literature Review
    1.2.1  Source-Free Domain Adaptation with a Fixed Target Domain
      - Offline training-time adaptation
      - Online test-time adaptation
    1.2.2  Test-Time Adaptation with Continually Changing Target Domains
      - The issue of ignoring the aggregation of similar features
      - The issue of ignoring domain-aware memory restoration
  1.3  Contributions and Thesis Organization
Chapter 2  System Design Rationale and Architecture Overview
  2.1  System Application Scenarios
  2.2  System Architecture Overview
Chapter 3  Weighted Contrastive Learning
  3.1  Contrastive Learning Architecture
    3.1.1  Data Augmentation for Contrastive Learning
    3.1.2  Feature Extraction Network
    3.1.3  Projection Network
  3.2  Contrastive Loss Functions
    3.2.1  Self-Supervised Contrastive Loss
    3.2.2  Supervised Contrastive Loss
    3.2.3  Weighted Contrastive Loss
  3.3  Downstream Classification Training
Chapter 4  Continual Test-Time Adaptation with Weighted Contrastive Learning and Pseudo-Label Correction
  4.1  Continual Test-Time Adaptation
    4.1.1  Definition of Continual Test-Time Adaptation
    4.1.2  Problems and Challenges
  4.2  Mean Teacher Model
  4.3  Test-Time Data Augmentation
  4.4  Domain Shift Detection
  4.5  Continual Test-Time Adaptation with Weighted Contrastive Learning and Pseudo-Label Correction
    4.5.1  Adaptation of the Mean Teacher and Student Models
      - Student model adaptation
      - Teacher model adaptation
    4.5.2  Domain-Aware Pseudo-Label Correction Module
Chapter 5  Experimental Results and Discussion
  5.1  Experimental Platform
  5.2  Source-Domain Pretrained Model
    5.2.1  Datasets and Evaluation Metrics
    5.2.2  Network Training Settings and Procedure
    5.2.3  Component Substitution Experiments on the Contrastive Learning Architecture
      - Choice of feature extraction network
      - Projection network substitution experiments
    5.2.4  Weight Substitution Experiments for Weighted Contrastive Learning
    5.2.5  Comparison with Related Work
  5.3  Test-Time Adaptation Model
    5.3.1  Datasets and Evaluation Metrics
    5.3.2  Network Training Settings and Procedure
    5.3.3  Experiments on the Threshold Hyperparameter α
    5.3.4  Experiments on Test-Time Adaptation Threshold Settings
    5.3.5  Contrastive Learning Substitution Experiments
    5.3.6  Edge Device Experiments
    5.3.7  Comparison with Related Work
Chapter 6  Conclusions and Future Research Directions
References
Committee Comments and Responses
