Summary: | Estimation of distance from objects in real-world scenes is an important topic in several applications such as navigation of autonomous robots, simultaneous localization and mapping (SLAM), and augmented reality (AR). Even though there is a technology for this purpose, in some cases, this technology has some disadvantages. For example, GPS systems are susceptible to interference, especially in places surrounded by buildings, under bridges or indoors; alternatively, RGBD sensors can be used, but they are expensive, and their operational range is limited. Monocular vision is a low-cost suitable alternative that can be used indoor and outdoor. However, monocular odometry is challenging because the object location can be known up a scale factor. Moreover, when objects are moving, it is necessary to estimate the location from consecutive images accumulating error. This paper introduces a new method to compute the distance from a single image of the desired object, with known dimensions, captured with a monocular calibrated vision system. This method is less restrictive than other proposals in the state-of-the-art literature. For the detection of interest points, a Region-based Convolutional Neural Network combined with a corner detector were used. The proposed method was tested on a standard dataset and images acquired by a low-cost and low-resolution webcam, under noncontrolled conditions. The system was tested and compared with a calibrated stereo vision system. Results showed the similar performance of both systems, but the monocular system accomplished the task in less time.
|