Summary: | In this paper, we present an approach for estimating the pose of football players in very low-resolution images. The camera recording a football match is placed far from the pitch so that it captures at least half of it. As a result, even with very high-resolution cameras, each player occupies only a very small image area. Variable weather conditions, shadows, and reflections make the task harder still. Such images are also very hard for humans to annotate. We therefore assume that no manually annotated training data from the target distribution is available. Instead of manually annotating a large dataset, we write a simple Python script that renders synthetic images with perfect annotations. We then train a vanilla CycleGAN (Cycle-Consistent Generative Adversarial Network) to transform the raw synthetic images into more realistic ones. We use the transformed images to train a CPN (Cascaded Pyramid Network) model. Without bells and whistles, this model achieves precision on our images similar to that of the same CPN model trained on the COCO (Common Objects in Context) keypoints dataset.
|