Summary: | Master's === National Taiwan University === Graduate Institute of Electrical Engineering === 104 === Three-dimensional environment reconstruction from a monocular camera has been a popular and challenging research topic in the past few years. This technique can be applied to unmanned vehicles for automatic navigation, environment exploration, and obstacle avoidance. It can also be applied to augmented reality. Since the camera is not equipped with an inertial measurement unit (IMU), the camera must be localized and the environment mapped simultaneously. In this thesis, camera pose estimation is based on a feature-based method [24: Lepetit et al. 2009] and a direct method [1: Engel et al. 2014]. The camera localization thread depends on the semi-dense map, which covers the high-gradient areas of the image and easily becomes noisy. Hence, this thesis proposes a method that regularizes the reconstructed semi-dense map without affecting the accuracy of camera pose localization. The regularization method eliminates noise and smooths the semi-dense map. Furthermore, it uses the photometric information between two images, unlike other methods that use only depth and spatial relations. The reconstruction algorithm is divided into three parts: stereo matching, piecewise planar constraint, and plane optimization. Since high-gradient areas are always narrow, making the piecewise planar constraint hard to apply, a stereo matching method is proposed that broadens each high-gradient area using its nearby low-gradient pixels. After the semi-dense map is reconstructed, it is passed to the piecewise planar constraint, which estimates an initial piecewise plane for each pixel. Finally, an optimization method refines each estimated piecewise plane.
In this thesis, the proposed stereo matching combines the prior depth of ORB features [27: Rublee et al. 2011], a KD-tree [36: Bentley 1975], a priority queue, and the entropy of the histogram of oriented gradients. The aim is to correctly match the low-gradient areas around high-gradient areas between two images by using epipolar geometry. Because it is hard to match two textureless areas between two images, the best nearby textured area is searched for and used in the matching procedure. First, if a pixel does not hold an inverse depth hypothesis, nearby ORB features that have initial depth knowledge are used to initialize its inverse depth value, which shortens the search length along the epipolar line and improves matching accuracy. The textured area containing high-gradient pixels is found by a k-nearest-neighbor search with the KD-tree, and the returned pixels are ordered by gradient magnitude with the priority queue. If a searched point passes the stereo searching constraint, a 5×5-pixel template is formed around that high-gradient point and used for the stereo line search. The corresponding points are considered matched if the residual between the templates in the two images passes the stereo matching threshold, which varies with the entropy of the histogram of oriented gradients in the search region.
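The candidate ordering and the entropy measure described above can be illustrated with a minimal Python sketch. It is not the thesis implementation: the function names `rank_candidates` and `orientation_entropy` are hypothetical, a brute-force nearest-neighbor search stands in for the KD-tree, and the 8-bin, magnitude-weighted histogram is an assumed choice. How the entropy maps to the matching threshold is not specified in this summary, so that step is left out.

```python
import heapq
import math


def rank_candidates(query, pixels, k):
    """Return the k pixels nearest to `query`, ordered by descending gradient magnitude.

    `pixels` is a list of (x, y, gradient_magnitude) tuples. The nearest-neighbor
    step is brute force here (an assumption); a KD-tree would replace it in practice.
    """
    qx, qy = query
    nearest = heapq.nsmallest(
        k, pixels, key=lambda p: (p[0] - qx) ** 2 + (p[1] - qy) ** 2
    )
    # Max-priority queue on gradient magnitude, built by negating the key.
    heap = [(-mag, x, y) for x, y, mag in nearest]
    heapq.heapify(heap)
    ordered = []
    while heap:
        neg_mag, x, y = heapq.heappop(heap)
        ordered.append((x, y, -neg_mag))
    return ordered


def orientation_entropy(gradients, bins=8):
    """Shannon entropy (in bits) of a magnitude-weighted orientation histogram.

    `gradients` is an iterable of (gx, gy) pairs from the search region.
    """
    hist = [0.0] * bins
    for gx, gy in gradients:
        mag = math.hypot(gx, gy)
        if mag == 0.0:
            continue  # flat pixels carry no orientation information
        angle = math.atan2(gy, gx) % (2.0 * math.pi)
        idx = min(int(angle / (2.0 * math.pi) * bins), bins - 1)
        hist[idx] += mag
    total = sum(hist)
    if total == 0.0:
        return 0.0
    return -sum((h / total) * math.log2(h / total) for h in hist if h > 0.0)
```

A region whose gradients all point the same way has an entropy of zero, while gradients spread evenly over all bins give the maximum of log2(bins) bits, so the value can serve as a rough measure of how ambiguous the local texture is.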
In the regularization part of this thesis, each small piece of the point cloud, projected from the image into 3D coordinates, is assumed to fit a plane. Each piece corresponds to 5×5 pixels in the image. Since this assumption does not hold if the piece lies on the border between two different objects or in a discontinuous area, the planar constraint is applied to identify non-planar regions. For pieces that pass the planar constraint, the Gauss-Newton method is used to minimize the photometric error between the two patches obtained by projecting the 3D piece into the two images, which yields the optimal plane parameters. These parameters are then used to eliminate noise and smooth the point cloud. The experimental results demonstrate that the proposed regularization algorithm eliminates most of the noise and reconstructs a clearer point cloud.
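The plane optimization step can be illustrated with a toy Gauss-Newton sketch in Python. It rests on strong simplifying assumptions that are not taken from the thesis: the two views are treated as a rectified pair, so the plane reduces to a disparity that is affine in patch coordinates, d(u, v) = a·u + b·v + c. The second image is a synthetic analytic texture, so no interpolation is needed. All names (`texture`, `fit_plane`, `TRUE_PLANE`) are hypothetical.

```python
import math


def texture(x, y):
    """Synthetic second image I2 (assumed analytic for this sketch)."""
    return math.sin(0.4 * x) + 0.5 * math.cos(0.3 * y)


def texture_dx(x, y):
    """Analytic horizontal gradient of I2."""
    return 0.4 * math.cos(0.4 * x)


def solve3(A, b):
    """Solve a 3x3 linear system by Gaussian elimination with partial pivoting."""
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, 3):
            f = M[r][col] / M[col][col]
            for c in range(col, 4):
                M[r][c] -= f * M[col][c]
    x = [0.0, 0.0, 0.0]
    for i in reversed(range(3)):
        s = M[i][3] - sum(M[i][j] * x[j] for j in range(i + 1, 3))
        x[i] = s / M[i][i]
    return x


def fit_plane(ref_patch, theta, iterations=10):
    """Gauss-Newton on the photometric error of one patch.

    ref_patch: {(u, v): reference intensity}; theta: initial guess [a, b, c].
    """
    theta = list(theta)
    for _ in range(iterations):
        JtJ = [[0.0] * 3 for _ in range(3)]
        Jtr = [0.0] * 3
        for (u, v), i_ref in ref_patch.items():
            d = theta[0] * u + theta[1] * v + theta[2]
            x = u - d                      # warped column in the second image
            r = texture(x, v) - i_ref      # photometric residual
            g = -texture_dx(x, v)          # d(residual) / d(disparity)
            J = [g * u, g * v, g]
            for i in range(3):
                Jtr[i] += J[i] * r
                for j in range(3):
                    JtJ[i][j] += J[i] * J[j]
        step = solve3(JtJ, [-t for t in Jtr])   # normal equations: JtJ * step = -Jtr
        theta = [t + s for t, s in zip(theta, step)]
    return theta


# Render a 5x5 reference patch from a known plane, then recover the plane.
TRUE_PLANE = [0.05, -0.03, 0.5]
PATCH = {
    (u, v): texture(u - (TRUE_PLANE[0] * u + TRUE_PLANE[1] * v + TRUE_PLANE[2]), v)
    for u in range(-2, 3)
    for v in range(-2, 3)
}
ESTIMATE = fit_plane(PATCH, [0.0, 0.0, 0.4])
```

In this noise-free setting the estimate converges to the plane used to render the patch. With real images, the texture lookup would need interpolation and the warp would follow the actual relative camera pose rather than a pure horizontal shift.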
|