Summary: | Simultaneous localisation and mapping (SLAM) based on computer vision has matured remarkably over the past few years, and is now rapidly transitioning into practical applications such as autonomous vehicles, drones, augmented reality (AR) / virtual reality (VR) devices, and service robots, to name a few. These real-time, real-world SLAM applications require instantaneous reaction to dynamic environments, the ability to operate in scenes with extreme lighting variation, and high power efficiency. The standard video cameras on which they rely, however, struggle to meet these demands, suffering either huge bandwidth requirements and power consumption at high frame rates, or degraded image quality through blur, noise, or saturation.

This research addresses the constraints imposed by standard cameras, and was motivated by the silicon retinas of neuromorphic engineering, which mimic some of the superior properties of human vision. One such bio-inspired imaging sensor, the event camera, offers a new paradigm for real-time vision, with its high measurement rate, low latency, high dynamic range, and low data rate. Rather than a sequence of video frames as from a standard camera, an event camera outputs a stream of asynchronous events at microsecond resolution, each indicating when an individual pixel records a log-intensity change exceeding a preset threshold. This novel sensor has proven very challenging to use in most computer vision problems, because standard computer vision techniques, which require synchronous intensity information, cannot be applied to its fundamentally different visual measurements.

In this thesis, we show for the first time that an event stream, with no additional sensing, can be used to track accurate camera rotation while building a persistent, high-quality mosaic of an unstructured scene that is super-resolution accurate and has high dynamic range. We also present the first algorithm provably able to track general 6D motion while reconstructing arbitrary structure, including its intensity, and recovering grayscale video, relying exclusively on event camera data. All of the methods operate on an event-by-event basis in real time and are based on probabilistic filtering to maximise update rates with low latency.

Through experimental results, we show that extremely rapid motion tracking and high dynamic range scene reconstruction without motion blur are achievable by harnessing the superior properties of the event camera, and that far lower power consumption is within reach thanks to its low data rate. We hope that this work opens the door to practical solutions to the current limitations of real-world visual SLAM applications.
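To make the event-based sensing model described above concrete, the following minimal sketch simulates how a single pixel of an event camera turns a continuous intensity signal into asynchronous events. The names (`Event`, `events_from_pixel`) and the contrast-threshold value are illustrative assumptions, not definitions from the thesis.

```python
from dataclasses import dataclass
import math

@dataclass
class Event:
    x: int         # pixel column
    y: int         # pixel row
    t: float       # timestamp in seconds (microsecond resolution)
    polarity: int  # +1 for a log-intensity increase, -1 for a decrease

def events_from_pixel(x, y, times, intensities, threshold=0.15):
    """Simulate the event generation of one pixel: emit an event each
    time the log intensity drifts by at least `threshold` from the
    reference level at which the last event fired."""
    events = []
    ref = math.log(intensities[0])  # reference log intensity
    for t, i in zip(times[1:], intensities[1:]):
        log_i = math.log(i)
        # A large change may fire several events at the same timestamp.
        while abs(log_i - ref) >= threshold:
            pol = 1 if log_i > ref else -1
            ref += pol * threshold  # step the reference by one threshold
            events.append(Event(x, y, t, pol))
    return events
```

Note that no event is produced while the pixel's log intensity stays within the threshold band, which is what gives the sensor its low data rate: static parts of the scene generate no output at all.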
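The event-by-event probabilistic filtering underlying the thesis methods can be illustrated with a hypothetical per-pixel scalar Kalman-style update; the actual filters in the thesis estimate quantities such as per-pixel log-intensity gradient and are more elaborate, and how a scalar measurement is derived from an event (via the contrast threshold and the current motion estimate) is glossed over here. `PerPixelFilter` and its parameters are assumptions for illustration only.

```python
import numpy as np

class PerPixelFilter:
    """Minimal per-pixel scalar Kalman filter, updated one event at a
    time. Each pixel holds a state mean `mu` (e.g. some brightness or
    gradient quantity) and a variance `sigma2`; each incoming event is
    treated as a noisy scalar measurement of that pixel's state."""

    def __init__(self, width, height, prior_var=1.0, meas_var=0.25):
        self.mu = np.zeros((height, width))                # state means
        self.sigma2 = np.full((height, width), prior_var)  # state variances
        self.meas_var = meas_var                           # measurement noise

    def update(self, x, y, measurement):
        """Standard Bayesian fusion of two Gaussians: the pixel's
        current belief and the new event-derived measurement."""
        gain = self.sigma2[y, x] / (self.sigma2[y, x] + self.meas_var)
        self.mu[y, x] += gain * (measurement - self.mu[y, x])
        self.sigma2[y, x] *= (1.0 - gain)
        return self.mu[y, x]
```

Because each update touches a single pixel and costs only a handful of arithmetic operations, this pattern can keep pace with a microsecond-rate event stream, which is what allows event-by-event filtering to maximise update rates with low latency.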