Unsupervised Learning for Structure from Motion

Perception of depth, ego-motion and robust keypoints is critical for SLAM and structure from motion applications. Neural networks have achieved great performance in perception tasks in recent years, but collecting labeled data for supervised training is labor intensive and costly. This thesis explo...

Bibliographic Details
Main Author: Örjehag, Erik
Format: Others
Language: English
Published: Linköpings universitet, Datorseende 2021
Subjects:
sfm
Online Access: http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-173731
id ndltd-UPSALLA1-oai-DiVA.org-liu-173731
record_format oai_dc
spelling ndltd-UPSALLA1-oai-DiVA.org-liu-173731; 2021-03-09T05:27:09Z; Unsupervised Learning for Structure from Motion; eng; Örjehag, Erik; Linköpings universitet, Datorseende; 2021; sfm; structure from motion; depth; ego-motion; unsupervised learning; consensus maximization; Computer Sciences; Datavetenskap (datalogi); Student thesis; info:eu-repo/semantics/bachelorThesis; text; http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-173731; application/pdf; info:eu-repo/semantics/openAccess
collection NDLTD
language English
format Others
sources NDLTD
topic sfm
structure from motion
depth
ego-motion
unsupervised learning
consensus maximization
Computer Sciences
Datavetenskap (datalogi)
description Perception of depth, ego-motion and robust keypoints is critical for SLAM and structure from motion applications. Neural networks have achieved great performance in perception tasks in recent years, but collecting labeled data for supervised training is labor intensive and costly. This thesis explores recent methods for unsupervised training of neural networks that predict depth, ego-motion and keypoints and perform geometric consensus maximization. The benefit of unsupervised training is that the networks can learn from raw data collected from the camera sensor instead of labeled data. The thesis focuses on training on images from a monocular camera, where no stereo or LIDAR data is available. The experiments compare different techniques for depth and ego-motion prediction from previous research and show how the techniques can be combined successfully. A keypoint prediction network is evaluated and its performance is compared with the ORB detector provided by OpenCV. A geometric consensus network is also implemented and its performance is compared with the RANSAC algorithm in OpenCV. The consensus maximization network is trained on the output of the keypoint prediction network. For future work it is suggested that all networks could be combined and trained jointly to reach a better overall performance. The results show (1) which techniques in unsupervised depth prediction are most effective, (2) that the keypoint prediction network outperformed the ORB detector, and (3) that the consensus maximization network was able to classify outliers with comparable performance to the RANSAC algorithm of OpenCV.
author Örjehag, Erik
title Unsupervised Learning for Structure from Motion
title_sort unsupervised learning for structure from motion
publisher Linköpings universitet, Datorseende
publishDate 2021
url http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-173731
work_keys_str_mv AT orjehagerik unsupervisedlearningforstructurefrommotion
_version_ 1719383136269238272