Summary: | Zero-shot classification has attracted wide attention in recent years because it does not require labeled samples of new categories, which reduces annotation cost in applications. This paper presents a zero-shot method for image classification based on word vector enhancement and distance metric learning. Specifically, a convolutional neural network (CNN) is employed to extract image feature vectors that have the same dimension as the semantic feature vectors. Word vectors are then learned without supervision from a Wikipedia corpus using the skip-gram model. The analysis dictionary learning model is improved to reduce redundant information in the word vectors, and the resulting sparse vectors are used as semantic features. A distance metric learning method is employed to measure the distance between image features and semantic features. Finally, classification is performed by a nearest-neighbor classifier. The effectiveness of the proposed algorithm is validated on the AwA and CUB datasets. Experimental results demonstrate that the proposed method performs well in terms of both accuracy and robustness.
|
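The final step described in the summary, nearest-neighbor classification under a learned distance, can be illustrated with a minimal sketch. The summary does not state the form of the metric, so the quadratic (Mahalanobis-style) distance below is an assumption, as are the matrix `M`, the class labels, the toy vectors, and the function names `metric_distance` and `classify`. It is not the authors' implementation.

```python
from typing import Dict, Sequence

Vector = Sequence[float]
Matrix = Sequence[Sequence[float]]


def metric_distance(x: Vector, s: Vector, M: Matrix) -> float:
    """Quadratic distance (x - s)^T M (x - s); M is assumed to be learned elsewhere."""
    diff = [xi - si for xi, si in zip(x, s)]
    # Multiply M by the difference vector, row by row.
    m_diff = [sum(row[j] * diff[j] for j in range(len(diff))) for row in M]
    return sum(d * md for d, md in zip(diff, m_diff))


def classify(x: Vector, prototypes: Dict[str, Vector], M: Matrix) -> str:
    """Return the label whose semantic vector is nearest to the image feature x."""
    return min(prototypes, key=lambda label: metric_distance(x, prototypes[label], M))


# Hypothetical toy data: a 2-D metric and one semantic vector per unseen class.
M = [[2.0, 0.0], [0.0, 0.5]]
prototypes = {"zebra": [1.0, 0.0], "whale": [0.0, 1.0]}
image_feature = [0.9, 0.2]

predicted = classify(image_feature, prototypes, M)
print(predicted)
```

Because the image features and semantic features share a dimension, a test image can be compared directly with the semantic vector of each unseen class, and no labeled images of those classes are needed at prediction time.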