Summary: | Real-time virtual and mixed reality systems have diverse uses in real-world data visualization, representation, and remote collaboration in distance learning settings, especially at universities. Designing such systems involves the challenge of accurately mapping real-world data and physical structure to a digital form of the physical space, also called a virtual model. Researchers have created similar systems using multiple cameras, stereo cameras, accelerometers, and motion detectors. This report presents a platform that detects and tracks the real-time locations of people in buildings and maps those locations into virtual models as avatars, using omni-directional cameras installed in the physical space. These models were created as part of the Mirror Worlds project, whose infrastructure is funded by the National Science Foundation. This infrastructure connects the virtual and physical aspects of the environment through a coordinate-based data networking system, so that users can interact with the rest of the system, including environment objects and other users. This is an interdisciplinary project in which students from various departments have worked on developing virtual models of the Moss Art Center and Torgersen Hall in Unity / X3D. Students from the Department of Computer Science developed the coordinate-based data networking system. A prototype detection and tracking algorithm that extracts location information was developed in MATLAB using background subtraction.
The proposed approach combines background subtraction and neural networks with heuristics based on spatial information about the physical space. The system was scaled to work across multiple buildings, extract the locations of people present in the physical space, and map each location into the shared virtual space as an avatar. The concept of remote presence was extended to create a collaborative object manipulation application using the Leap Motion controller. A user study conducted with this application evaluated the effects of fidelity on performing the collaborative object manipulation task in the shared virtual space.
Since no annotated video dataset of people seen from an overhead omni-directional camera is publicly available, three videos were annotated manually to test the performance of the approach. The current approach runs at near real-time rates. All three video sequences were evaluated to compute frame-based detection accuracy. For the first video sequence, people detection achieved a precision of 93.85% and a recall of 95.06%. === Master of Science
|