Vision-Based Tracking with Dynamic Structured Light
Abstract
The term virtual reality (VR) has been used in the popular press to describe systems
that completely replace the user's view with computer graphics, a view of a synthetic
world. Such systems allow building designers to visualize the "completed" building
and potential problems in the design before construction. A VR system might allow
scientists to view environments as large as galaxies or as small as atoms as if they were
human-sized. The primary benefit in these cases would be to interact in more natural
ways with objects that, due to their sizes, are usually beyond our reach.
In academic literature, these systems are known as virtual environment (VE) systems.
Such systems comprise three hardware components:
1. a display subsystem through which the user will look into the synthetic world,
often a head-mounted display (HMD)
2. an image generation subsystem that can paint the proper image onto this display
3. a tracking subsystem that can determine the user's viewpoint and view direction
in order to paint the correct picture
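As a rough illustration of how these three subsystems cooperate on every frame,
consider the minimal sketch below; the object names and methods (tracker, renderer,
display and their calls) are hypothetical placeholders, not drawn from any particular
system:

    # Minimal per-frame loop of a VE system.
    # All interfaces here are hypothetical placeholders for the three subsystems.
    def run_ve_loop(tracker, renderer, display):
        while display.is_active():
            pose = tracker.read_pose()    # (3) viewpoint and view direction
            frame = renderer.draw(pose)   # (2) paint the synthetic world from that pose
            display.present(frame)        # (1) show the frame on the HMD

The essential coupling is that the image generator's virtual camera is driven directly
by the tracker's pose estimate, so any tracking error appears immediately as error in
the displayed image.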
The goals are to present a new world that is "real" to the user's senses and to give the
user a sense of presence in that environment. "Real" in this sense implies that the
environment exhibits a consistent, believable appearance and behavior.
A variant on this idea is to not completely replace the user's view of the surrounding
environment, but merely to add to it. The term augmented reality (AR) has been given to
applications that merge computer graphics with images of the real world [Caudell92].
Since the user sees real objects in the environment, the sense of immersion changes.
However, the merging of synthetic objects with real imagery still requires a consistent,
believable depiction of the appearance and behavior of the synthetic objects.
Synthetic objects must behave as their real counterparts would. They must be situated
in, and stay in, the proper place as the user moves through the environment. In
other words, they must be properly aligned with the real objects. They must disappear
when another object (real or synthetic) obstructs the user's view. These are two
of the most difficult tasks in merging real and synthetic imagery. For most AR systems,
achieving alignment reduces to acquiring accurate tracking data for the user's viewpoint
and view direction. Depicting occlusion properly, in turn, reduces to knowing the
relative distances of the real and synthetic objects from the user.
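To make the occlusion condition concrete, it amounts in principle to a per-pixel depth
comparison: wherever the real surface is nearer to the user than the synthetic one, the
real image should show through. The following minimal sketch assumes per-pixel depth
maps of both the real and synthetic scenes are somehow available; the function and
array names are illustrative, not part of this work:

    import numpy as np

    def composite(real_rgb, real_depth, synth_rgb, synth_depth):
        """Depth-keyed compositing: at each pixel, show the nearer surface.

        real_rgb, synth_rgb:     (H, W, 3) color images
        real_depth, synth_depth: (H, W) distances from the user's viewpoint
        """
        synth_in_front = synth_depth < real_depth    # synthetic object occludes real
        mask = synth_in_front[..., np.newaxis]       # broadcast over color channels
        return np.where(mask, synth_rgb, real_rgb)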
Achieving alignment and depicting occlusion have been difficult problems to solve.
Most proposed solutions have had either limited success or a limited domain of application.
The major motivation behind this research was to create a system that could achieve
alignment without restricting the user in the ways that previous approaches had. We
were able to do this in a framework that provides a good basis from which to solve the
problem of correctly depicting occlusion relationships as well.