Perceptual user interfaces promise modes of fluid human-computer interaction that complement the mouse and keyboard, and are well motivated in non-desktop scenarios (e.g., media center control). Such interfaces have been slow to catch on for a variety of reasons, including the computational burden they impose, a lack of robustness outside the laboratory, unreasonable calibration demands, and a shortage of sufficiently compelling applications. We address these difficulties with a fast stereo vision algorithm for interacting with the computer at a distance. The system uses two inexpensive video cameras to extract depth information. This depth information improves the robustness of automatic object detection and tracking, and may also be used directly in applications. We demonstrate the algorithm in combination with speech recognition to perform various window management tasks, present preliminary user study results on the use of the system, and discuss the implications of such a system for tomorrow's user interfaces.
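The abstract's two-camera setup rests on a standard idea: a feature seen by both cameras appears shifted (disparity) between the two images, and depth follows by triangulation as Z = f * B / d. The sketch below illustrates this principle with naive block matching on a synthetic scanline; the focal length, baseline, and pixel values are illustrative assumptions, not details of the GWindows algorithm itself.

```python
def match_disparity(left_row, right_row, x, window=1, max_disp=8):
    """Estimate disparity at column x of left_row by naive block matching:
    slide a small window over right_row and keep the shift with the lowest
    sum-of-absolute-differences (SAD) cost."""
    best_d, best_cost = 0, float("inf")
    for d in range(max_disp + 1):
        if x - d - window < 0:  # window would fall off the image edge
            break
        cost = sum(
            abs(left_row[x + k] - right_row[x - d + k])
            for k in range(-window, window + 1)
        )
        if cost < best_cost:
            best_cost, best_d = cost, d
    return best_d

def depth_from_disparity(d, focal_px=500.0, baseline_m=0.1):
    """Triangulate depth: Z = f * B / d. Larger disparity means a closer
    object; zero disparity corresponds to a point at infinity.
    (focal_px and baseline_m are made-up illustrative values.)"""
    return float("inf") if d == 0 else focal_px * baseline_m / d

# Synthetic scanline: the right image sees the same intensity pattern
# shifted 3 pixels to the left, so matching should recover disparity 3.
left = [0, 0, 0, 10, 50, 90, 50, 10, 0, 0, 0, 0]
right = left[3:] + [0, 0, 0]
d = match_disparity(left, right, x=5)   # recovers d == 3
z = depth_from_disparity(d)             # depth in metres for that pixel
```

In a real system this matching runs per pixel over rectified image pairs; the resulting depth map is what lets a tracker distinguish a hand held toward the screen from background clutter at greater depth.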
Snapshots of GWindows in action (top to bottom, left to right): the stereo vision interface, GWindows in use with three monitors, moving windows by hand, and ink-drawing by hand.
GWindows: Towards Robust Perception-Based UI, Andy Wilson & Nuria Oliver, To appear in Proc. of CVPR 2003 (Workshop on 'Computer Vision for HCI')
Video that illustrates how GWindows works:
High Resolution Version (1.5 Mbps)
Low Resolution Version (384 kbps)