System Design and Analysis for Creating a 3D Virtual Street Scene for Autonomous Vehicles using Geometric Proxies from a Single Video Camera

Bibliographic Details
Main Author: Wong, Timothy
Format: Others
Published: DigitalCommons@CalPoly 2019
Online Access:https://digitalcommons.calpoly.edu/theses/2041
https://digitalcommons.calpoly.edu/cgi/viewcontent.cgi?article=3462&context=theses
Description
Summary: Self-driving vehicles use a variety of sensors to understand the environment around them. To do so, they must accurately measure the distances and positions of nearby objects. A common representation of the environment around the vehicle is a 3D point cloud: a set of 3D data points representing the positions of real-world objects relative to the car. While accurate and useful, these point clouds require large amounts of memory compared to other representations such as lightweight polygonal meshes. In addition, 3D point clouds can be difficult for a human to interpret visually, as the data points do not always form a naturally coherent object. This paper introduces a system that lowers the memory consumption needed for the graphical representation of a virtual street environment. At this time, the proposed system takes as input a single front-facing video. The system extracts still images of the scene from the video, which are then segmented to distinguish the relevant objects, such as cars and stop signs. The system generates a corresponding virtual street scene in which these key objects are visualized as low-poly (low-resolution) models of the respective objects. This virtual 3D street environment is created to allow a remote operator to visualize the world the car is traveling through. At this time, the virtual street includes geometric proxies for parallel-parked cars in the form of lightweight polygonal meshes. These meshes are predefined and take up less memory than a point cloud, which can be costly to transmit from the remote vehicle and potentially difficult for a remote human operator to understand. This paper contributes the design and analysis of an initial system for generating and placing these geometric proxies of parked cars in a virtual street environment from one input video. We discuss the system's limitations, measure its error, and reflect on future improvements.
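
As a rough illustration of the pipeline the summary describes (extracting still images from the single front-facing video, then segmenting them to find cars), the sketch below uses OpenCV for frame extraction and an off-the-shelf pretrained torchvision Mask R-CNN as a stand-in segmenter. This is only a minimal sketch under those assumptions, not the thesis's implementation: the actual models, the 3D proxy-placement step, and names such as street.mp4 are placeholders.

    # Minimal pipeline sketch: frame extraction -> car segmentation.
    # NOT the author's system; segmentation model and file name are assumed.
    import cv2
    import torch
    import torchvision

    CAR_CLASS = 3  # "car" in the COCO label set used by this pretrained model

    model = torchvision.models.detection.maskrcnn_resnet50_fpn(pretrained=True)
    model.eval()

    def extract_frames(video_path, every_n=30):
        """Pull still images from the single front-facing video."""
        cap = cv2.VideoCapture(video_path)
        idx = 0
        while True:
            ok, frame = cap.read()
            if not ok:
                break
            if idx % every_n == 0:
                yield cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
            idx += 1
        cap.release()

    def segment_cars(frame, score_thresh=0.7):
        """Return 2D bounding boxes of cars detected in one frame."""
        tensor = torch.from_numpy(frame).permute(2, 0, 1).float() / 255.0
        with torch.no_grad():
            out = model([tensor])[0]
        boxes = []
        for box, label, score in zip(out["boxes"], out["labels"], out["scores"]):
            if label.item() == CAR_CLASS and score.item() >= score_thresh:
                boxes.append(box.tolist())
        return boxes

    for frame in extract_frames("street.mp4"):
        for box in segment_cars(frame):
            # In the full system each detection would be lifted to a 3D pose
            # and rendered as a predefined low-poly car mesh in the virtual
            # street; here we only report the 2D detection.
            print("parked car at image box:", box)

Replacing each detected parked car with a predefined low-poly mesh, rather than transmitting its point-cloud samples, is the source of the memory savings the summary emphasizes.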