Goodpoint: unsupervised learning of key point detection and description
Main Authors: | , , |
---|---|
Format: | Article |
Language: | English |
Published: |
Saint Petersburg National Research University of Information Technologies, Mechanics and Optics (ITMO University)
2021-02-01
|
Series: | Naučno-tehničeskij Vestnik Informacionnyh Tehnologij, Mehaniki i Optiki |
Subjects: | |
Online Access: | https://ntv.ifmo.ru/file/article/20188.pdf |
Summary: | Subject of Research. The paper studies algorithms for key point detection and description, which are widely used in computer vision. Typically, a corner detector serves as the key point detector, and this also holds for neural key point detectors. For some types of medical images, such detectors are problematic because they find too few key points. The paper considers the problem of training a neural network key point detector on unlabeled images.
Method. We propose a definition of key points that does not depend on specific visual features. We consider a method for training a neural network model that detects and describes key points on unlabeled data. The method is based on homographic image transformations: the neural network model is trained to detect the same key points on pairs of noisy images related by a homography. Only positive examples are used for detector training, namely points that are correctly matched using the features the neural network model produces for key point description.
Main Results. The neural network model is trained with the unsupervised learning algorithm. For ease of comparison, the proposed model has a similar architecture and the same number of parameters as the supervised model. The models are evaluated on three different datasets: natural images, synthetic images, and retinal photographs. The proposed model shows results similar to the supervised model on natural images and better results on retinal photographs. Results improve further after additional training of the proposed model on images from the target domain, which is an advantage over a model trained on a labeled dataset. For comparison, the harmonic mean of the following metrics is used: the accuracy and the depth of matching by descriptors, the repeatability of key points, and image coverage.
Practical Relevance. The proposed algorithm makes it possible to train the neural network key point detector together with the feature extraction model on images from the target domain without costly dataset labeling, which reduces labor costs for developing a system that uses the detector. |
---|---|
ISSN: | 2226-1494 2500-0373 |
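
The positive-example selection and the harmonic-mean score described in the summary can be illustrated with a small sketch. It is not the authors' implementation: the function names, the mutual nearest-neighbour matching rule, the 2-pixel tolerance, and all sample values are assumptions made for illustration. The sketch keeps a key point pair only if the two descriptors are mutual nearest neighbours and the point from the first image, warped by the known homography, lands close to its match in the second image.

```python
from statistics import harmonic_mean


def warp_point(H, pt):
    """Apply a 3x3 homography H (nested lists) to a 2D point (x, y)."""
    x, y = pt
    u = H[0][0] * x + H[0][1] * y + H[0][2]
    v = H[1][0] * x + H[1][1] * y + H[1][2]
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    return (u / w, v / w)


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((p - q) ** 2 for p, q in zip(a, b))


def nearest(query, candidates):
    """Index of the candidate vector closest to the query vector."""
    return min(range(len(candidates)),
               key=lambda j: sq_dist(query, candidates[j]))


def positive_matches(kps_a, desc_a, kps_b, desc_b, H, max_px=2.0):
    """Return index pairs (i, j) usable as positive training examples.

    Assumed rule (hypothetical, for illustration): descriptors i and j
    are mutual nearest neighbours, and key point i warped by H lies
    within max_px pixels of key point j.
    """
    positives = []
    for i, d in enumerate(desc_a):
        j = nearest(d, desc_b)
        if nearest(desc_b[j], desc_a) != i:
            continue  # not a mutual nearest neighbour
        if sq_dist(warp_point(H, kps_a[i]), kps_b[j]) <= max_px ** 2:
            positives.append((i, j))
    return positives


# Toy data (made up): image B is image A shifted 10 pixels to the right.
H = [[1.0, 0.0, 10.0],
     [0.0, 1.0, 0.0],
     [0.0, 0.0, 1.0]]
kps_a = [(0.0, 0.0), (5.0, 5.0), (20.0, 3.0)]
desc_a = [(1.0, 0.0), (0.0, 1.0), (0.7, 0.7)]
kps_b = [(10.0, 0.0), (15.0, 5.0), (90.0, 90.0)]
desc_b = [(0.9, 0.1), (0.1, 0.9), (0.7, 0.6)]

pairs = positive_matches(kps_a, desc_a, kps_b, desc_b, H)
print(pairs)  # [(0, 0), (1, 1)]; the third pair fails the geometric check

# Single score from several metrics; the values below are placeholders,
# not results from the paper.
metrics = {"accuracy": 0.5, "repeatability": 1.0}
score = harmonic_mean(metrics.values())
```

The harmonic mean is dominated by the weakest metric, so a model cannot score well by, for example, having high repeatability but poor descriptor matching.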