Stereo Visual Odometry Pose Correction through Unsupervised Deep Learning

Visual simultaneous localization and mapping (VSLAM) plays a vital role in the field of positioning and navigation. At the heart of VSLAM is visual odometry (VO), which uses continuous images to estimate the camera’s ego-motion. However, due to many assumptions of the classical VO system, robots can...

Bibliographic Details
Main Authors:	Sumin Zhang, Shouyi Lu, Rui He, Zhipeng Bao
Format:	Article
Language:	English
Published:	MDPI AG 2021-07-01
Series:	Sensors
Subjects:	simultaneous localization and mapping (SLAM); visual odometry (VO); unsupervised deep learning; pose correction
Online Access:	https://www.mdpi.com/1424-8220/21/14/4735

id	doaj-6ac0ed0173a54fdc9ded5cadca464450
record_format	Article
spelling	doaj-6ac0ed0173a54fdc9ded5cadca464450; 2021-07-23T14:05:31Z; eng; MDPI AG; Sensors; 1424-8220; 2021-07-01; 21; 4735; 4735; doi:10.3390/s21144735; Stereo Visual Odometry Pose Correction through Unsupervised Deep Learning; Sumin Zhang, Shouyi Lu, Rui He, Zhipeng Bao (all: State Key Laboratory of Automotive Simulation and Control, Jilin University, Changchun 130022, China); https://www.mdpi.com/1424-8220/21/14/4735
collection	DOAJ
language	English
format	Article
sources	DOAJ
author	Sumin Zhang, Shouyi Lu, Rui He, Zhipeng Bao
title	Stereo Visual Odometry Pose Correction through Unsupervised Deep Learning
publisher	MDPI AG
series	Sensors
issn	1424-8220
publishDate	2021-07-01
description	Visual simultaneous localization and mapping (VSLAM) plays a vital role in the field of positioning and navigation. At the heart of VSLAM is visual odometry (VO), which uses continuous images to estimate the camera’s ego-motion. However, because the classical VO system relies on many assumptions, robots can hardly operate in challenging environments. To solve this challenge, we combine the multiview geometry constraints of the classical stereo VO system with the robustness of deep learning to present an unsupervised pose correction network for the classical stereo VO system. The pose correction network regresses a pose correction that compensates for the positioning error caused by violations of modeling assumptions, making the classical stereo VO positioning more accurate. The pose correction network does not rely on a dataset with ground-truth poses for training. The pose correction network also simultaneously generates a depth map and an explainability mask. Extensive experiments on the KITTI dataset show that the pose correction network can significantly improve the positioning accuracy of the classical stereo VO system. Notably, the corrected classical stereo VO system’s average absolute trajectory error, average translational relative pose error, and average translational root-mean-square drift on lengths of 100–800 m in the KITTI dataset are 13.77 cm, 0.038 m, and 1.08%, respectively. Therefore, the improved stereo VO system has almost reached the state of the art.
topic	simultaneous localization and mapping (SLAM); visual odometry (VO); unsupervised deep learning; pose correction
url	https://www.mdpi.com/1424-8220/21/14/4735

Similar Items