Author: Richmond Gagarin Manga
Thesis Title: ADAPTING FEATURES FOR FEW-SHOT LEARNING USING CONVOLUTIONAL ADAPTERS AND DIFFERENTIABLE SVM
Advisor: Kai-Lung Hua (花凱龍)
Committee: Arnulfo Azcarraga, Chuan-Kai Yang (楊傳凱), Jun-Cheng Chen (陳駿丞)
Degree: Master
Department: Department of Computer Science and Information Engineering, College of Electrical Engineering and Computer Science
Thesis Publication Year: 2020
Graduation Academic Year: 108
Language: English
Pages: 43
Keywords: few-shot learning, machine learning, computer vision, deep learning
Reference times: Clicks: 710, Downloads: 11
Many developments have been made in image classification using deep neural networks over the past few years.
However, deep networks still fall far short of the human visual system's ability to classify images from only one or a few examples.
Large amounts of data are not always feasible to obtain, which is why there have been previous attempts to use deep learning for image classification with only a few examples per class, a setting called few-shot learning.
Few-shot learning with deep neural networks is a difficult task, because training on a small set of examples is prone to overfitting.
Previous methods have attempted to transfer what a network has learned from a large dataset to another with only a few examples per class.
However, these methods are prone to forgetting the information learned from the base classes, which causes them to overfit on the novel classes.
We propose a novel method that freezes the weights learned on the base classes and uses adapters to prevent the network from forgetting and to avoid overfitting. We also have a fine-tuning stage that uses a differentiable SVM to improve the class boundaries.
Our method gives results comparable to state-of-the-art methods while adding only minimal parameters.
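The abstract's core idea, a frozen base-class backbone with small trainable adapters, fine-tuned with a differentiable SVM (multi-class hinge) loss, can be sketched in plain Python. Everything here is an illustrative assumption: the toy feature extractor, the per-channel adapter, and all names are stand-ins, not the thesis's actual architecture.

```python
# Illustrative sketch (not the thesis's code): a frozen feature
# extractor, a lightweight trainable adapter, and a differentiable
# multi-class SVM (hinge) loss for the fine-tuning stage.

def frozen_backbone(x):
    """Stands in for the network pretrained on base classes; its
    weights are frozen and never updated during few-shot adaptation."""
    return [2.0 * v for v in x]  # toy fixed transform

class Adapter:
    """Lightweight residual transform: the only trainable parameters.
    Zero-initialized so the adapted features start out identical to
    the frozen features, limiting early overfitting."""
    def __init__(self, dim):
        self.w = [0.0] * dim  # trainable per-channel scaling

    def __call__(self, feats):
        # residual form: frozen features plus a small learned correction
        return [f + wi * f for f, wi in zip(feats, self.w)]

def multiclass_hinge_loss(scores, true_class, margin=1.0):
    """Differentiable (subgradient) multi-class SVM objective: every
    wrong class scoring within `margin` of the true class contributes
    to the loss, which pushes the class boundaries apart."""
    return sum(max(0.0, s - scores[true_class] + margin)
               for j, s in enumerate(scores) if j != true_class)
```

With zero-initialized weights, `Adapter(d)(frozen_backbone(x))` reproduces the frozen features exactly; training then updates only the adapter (and classifier) under the hinge objective, so the base-class knowledge in the backbone is never overwritten.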
1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
2 Related Works . . . . . . . . . . . . . . . . . . . . . . . . . . . 5
2.1 Metric Learning . . . . . . . . . . . . . . . . . . . . . . . 5
2.2 Meta-Learning . . . . . . . . . . . . . . . . . . . . . . 6
2.3 Domain Adaptation . . . . . . . . . . . . . . . . . . . . . 7
3 Methods . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8
3.1 Episodic Sampling Task . . . . . . . . . . . . . . . . . . 8
3.2 Adapters . . . . . . . . . . . . . . . . . . . . . . . . . . . 11
3.3 Architecture . . . . . . . . . . . . . . . . . . . . . . . . . 13
3.4 Learning Procedure . . . . . . . . . . . . . . . . . . . . . 15
3.4.1 Base Network Pre-Training . . . . . . . . . . . . 15
3.4.2 Adapter Training for Few-Shot Learning . . . . . 15
3.4.3 Maximum Margin Refinement with Differentiable SVM . . . 16
3.4.4 Inference . . . . . . . . . . . . . . . . . . . . . . 17
4 Experiments . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18
4.1 Experimental Setup . . . . . . . . . . . . . . . . . . . . . 18
4.1.1 Datasets . . . . . . . . . . . . . . . . . . . . . . . 18
4.1.2 Experimental Setting . . . . . . . . . . . . . . . . 18
4.1.3 Implementation Details . . . . . . . . . . . . . . . 19
4.2 Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20
4.2.1 Accuracy . . . . . . . . . . . . . . . . . . . . . . 20
4.2.2 Efficiency and Optimization . . . . . . . . . . . . 22
4.3 Ablation . . . . . . . . . . . . . . . . . . . . . . . . . . . 24
4.3.1 Optimizing Adapter Design . . . . . . . . . . . . 24
4.3.2 Effectiveness of Maximum Margin Refinement . . 25
5 Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27
5.1 Future Work . . . . . . . . . . . . . . . . . . . . . . . . . 28