
Graduate Student: 蘇皓群 (Hao-Chin Su)
Thesis Title: 當前任務變分自動編碼器結合動態集成損失函數之少樣本學習
CTVAE: Current Task Variational Auto-Encoder with Dynamic Ensemble Loss for Few-Shot Learning
Advisor: 陳怡伶 (Yi-Ling Chen)
Committee Members: 葉彌妍 (Mi-Yen Yeh), 戴碧如 (Bi-Ru Dai)
Degree: Master
Department: College of Electrical Engineering and Computer Science - Department of Computer Science and Information Engineering
Year of Publication: 2021
Academic Year of Graduation: 109 (2020-2021)
Language: English
Pages: 36
Chinese Keywords: 少樣本學習、機器學習、變分自動編碼器、動態集成損失函數
English Keywords: Few-Shot Learning, Machine Learning, Variational Auto-Encoder, Dynamic Ensemble Loss
Usage Count: Views: 284 / Downloads: 2
Chinese Abstract:
Few-shot learning is a challenging task in which a classifier must adapt quickly to new classes. These new classes are unseen by the model during the training stage, and only a few samples (e.g., five images) are provided for learning each new class in the testing stage. When existing methods learn from such a small number of samples, they are easily affected by outliers. Moreover, the class center computed from these few samples may deviate from the true cluster center. To address these issues, we propose a new method for few-shot learning called the Current Task Variational Auto-Encoder (CTVAE). In our framework, a trained feature extractor first produces the features of the current task, and these features are used to repeatedly train the generator in CTVAE. Afterwards, we can use the sample features generated by CTVAE to find a new center for each class based on these newly generated features. Compared with the original center, the new center tends to be closer to the true center in vector space. CTVAE can break through the limitation of traditional few-shot learning methods, which can only fine-tune the model with very few samples in the testing stage. Furthermore, by generating features directly instead of generating images first, the training process of the generator in CTVAE is simplified and becomes more efficient, and features can be generated faster and more precisely. According to experiments on benchmark datasets (i.e., Mini-ImageNet, CUB, and CIFAR-FS), our proposed framework outperforms state-of-the-art methods, improving accuracy by 1-4%. We also conduct experiments on cross-domain tasks, and the results show that the proposed framework brings 1-5% accuracy improvements.


English Abstract:
Few-shot learning is a challenging task in which a classifier needs to quickly adapt to new classes. These new classes are unseen in the training stage, and only a few samples (e.g., five images) are provided for learning each new class in the testing stage. When existing methods learn with such a small number of samples, they can easily be affected by outliers. Moreover, the category center calculated from those few samples may deviate from the true center. To address these issues, we propose a novel approach called Current Task Variational Auto-Encoder (CTVAE) for few-shot learning. In our framework, a trained feature extractor first produces the features of the current task, and these features are used to repeatedly train the generator in CTVAE. After that, we can use CTVAE to generate additional sample features, and then find a new center for each category based on these newly generated features. Compared with the original center, the new center tends to be closer to the true center in vector space. CTVAE can break the limitation of traditional few-shot learning methods, which can only fine-tune the model with the very few samples available in the testing stage. Moreover, by generating the features directly without producing images first, the training process of the generator in CTVAE is simplified and becomes more efficient, and the features can be generated faster and more precisely. According to the experiments on benchmark datasets (i.e., Mini-ImageNet, CUB, and CIFAR-FS), our proposed framework is able to outperform the state-of-the-art methods and improves the accuracy by 1-4%. We also conduct experiments on cross-domain tasks, and the results show that the proposed framework can bring 1-5% accuracy improvements.
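To make the pipeline described in the abstract concrete, the sketch below trains a small VAE directly on the pre-extracted support features of a single few-shot episode, samples extra features from it, and moves each class center toward the mean of the real and generated features, then classifies queries by their nearest center. This is a minimal illustration under stated assumptions: the FeatureVAE module, layer sizes, training loop, and cosine nearest-center classifier are hypothetical stand-ins, not the thesis's exact CTVAE architecture, and the dynamic ensemble loss of Section 3.5 is not reproduced here.

    import torch
    import torch.nn.functional as F
    from torch import nn

    class FeatureVAE(nn.Module):
        # VAE over pre-extracted features; no image decoder is needed.
        def __init__(self, feat_dim=640, latent_dim=64):
            super().__init__()
            self.encoder = nn.Sequential(nn.Linear(feat_dim, 256), nn.ReLU())
            self.mu = nn.Linear(256, latent_dim)
            self.logvar = nn.Linear(256, latent_dim)
            self.decoder = nn.Sequential(
                nn.Linear(latent_dim, 256), nn.ReLU(), nn.Linear(256, feat_dim))

        def forward(self, x):
            h = self.encoder(x)
            mu, logvar = self.mu(h), self.logvar(h)
            z = mu + torch.randn_like(mu) * (0.5 * logvar).exp()  # reparameterization trick
            return self.decoder(z), mu, logvar

    def vae_loss(recon, x, mu, logvar):
        # Standard VAE objective: reconstruction error plus KL divergence.
        rec = F.mse_loss(recon, x)
        kld = -0.5 * torch.mean(1 + logvar - mu.pow(2) - logvar.exp())
        return rec + kld

    def refined_centers(vae, feats, labels, n_gen=50):
        # Recompute each class center from real plus VAE-generated features.
        centers = []
        for c in labels.unique(sorted=True):
            real = feats[labels == c]                   # (k_shot, feat_dim)
            with torch.no_grad():
                gen, _, _ = vae(real.repeat(n_gen, 1))  # stochastic reconstructions
            centers.append(torch.cat([real, gen]).mean(dim=0))
        return torch.stack(centers)                     # (n_way, feat_dim)

    # One hypothetical 5-way 5-shot episode with 640-d features; random
    # tensors stand in for the trained feature extractor's outputs.
    support_feats = torch.randn(25, 640)
    support_labels = torch.arange(5).repeat_interleave(5)
    query_feats = torch.randn(15, 640)

    # "Repeatedly train the generator" on the current task's support features.
    vae = FeatureVAE(feat_dim=support_feats.size(1))
    opt = torch.optim.Adam(vae.parameters(), lr=1e-3)
    for _ in range(200):
        recon, mu, logvar = vae(support_feats)
        loss = vae_loss(recon, support_feats, mu, logvar)
        opt.zero_grad(); loss.backward(); opt.step()

    # Classify each query by cosine similarity to its nearest refined center.
    centers = refined_centers(vae, support_feats, support_labels)
    logits = F.cosine_similarity(query_feats.unsqueeze(1), centers.unsqueeze(0), dim=-1)
    pred = logits.argmax(dim=1)

In the thesis itself, the generated features feed the dynamic ensemble loss of Section 3.5 rather than a fixed cosine classifier; the sketch only shows the center-refinement idea that both share.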

Table of Contents:
Abstract in Chinese
Abstract in English
Acknowledgements
Contents
List of Figures
List of Tables
1 Introduction
2 Related Works
3 Methodology
3.1 Problem Definition
3.2 Overview of Framework
3.3 Training Feature Extractor
3.4 Current Task Variational Auto-Encoder
3.5 Learning with Dynamic Ensemble Loss
4 Experiments
4.1 Main Results
4.2 Detailed Analysis of Different Model Settings
4.3 Ablation Studies
5 Conclusion & Future Work
References

