DyPERS: Dynamic Personal Enhanced Reality System
Nuria Oliver, Tony Jebara, Bernt Schiele, Alex Pentland
Abstract
DyPERS is a 'Dynamic Personal Enhanced Reality System' which uses augmented
reality and computer vision to overlay video and audio clips relevant to the
user on top of the real objects the user is paying attention to. The system is
wearable and adaptively learns an audio-visual memory, associating recorded
clips with everyday objects so that they can be evoked and played back in the
future.
Introduction
DyPERS is a 'Dynamic Personal Enhanced Reality System' which uses augmented
reality and computer vision to overlay video and audio clips relevant to the
user on top of the real objects the user is paying attention to. The user
wears a HUD (heads-up display) with a small ELMO CCD QN401E color camera
mounted on board (as in the Stochasticks system) and a wireless microphone. A
generic, trainable object recognition system processes images from the camera
as the user turns his head to view an object of interest, and automatically
highlights important objects previously specified by the user. The user shows
the system objects of interest ahead of time and associates video and audio
clips, recorded by the user, with those objects. DyPERS can thus be considered
a videographic memory.
System Architecture
The three main components of DyPERS are:
- Audio-visual memory, which accumulates personal memories and associates them with objects
- Generic trainable object recognition system using computer vision as input
- Wearable system with audio-visual input/output capabilities and interface
Audio-Visual Memory
Audio and video are recorded and played back in real time. A minimal sketch of
how such an associative memory might be structured is shown below.
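The following sketch shows one plausible way to structure this memory in Python, assuming clips are stored as files on disk and objects are identified by the label the recognizer assigns; the class and method names are hypothetical, not taken from DyPERS itself:

    class AudioVisualMemory:
        """Hypothetical sketch: maps recognized object labels to A/V clips."""

        def __init__(self):
            self._clips = {}  # object label -> path of the associated clip

        def associate(self, object_label, clip_path):
            """Bind a recorded audio/video clip to a recognized object."""
            self._clips[object_label] = clip_path

        def forget(self, object_label):
            """Negative feedback ('garbage' button): remove a bad association."""
            self._clips.pop(object_label, None)

        def recall(self, object_label):
            """Return the clip to play back when the object is seen, if any."""
            return self._clips.get(object_label)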
Visual Learning
The generic object recognition system uses computer vision. It is invariant to
scaling, translation, rotation, small lighting changes and deformations of the
object, so that the object can usually be recognized in different situations
(see the figure and the recognition sketch below). Some of its features are:
- Learns user-relevant visual cues
- Builds a statistical representation of the objects
- Groups sample images into objects, and several objects into an audio-visual association
(Figure: recognition of the tie)
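The recognizer cited in the bibliography below is based on multidimensional receptive field histograms. The following sketch conveys the general idea under stated assumptions: Gaussian-derivative filters at a single scale, histogram intersection as the match score, and illustrative bin counts, value ranges and threshold; none of these parameters are taken from DyPERS itself.

    import numpy as np
    from scipy.ndimage import gaussian_filter

    def receptive_field_histogram(image, bins=32, sigma=1.0):
        """Joint histogram of Gaussian x/y derivative responses."""
        dx = gaussian_filter(image, sigma, order=(0, 1))  # derivative along x
        dy = gaussian_filter(image, sigma, order=(1, 0))  # derivative along y
        hist, _, _ = np.histogram2d(dx.ravel(), dy.ravel(), bins=bins,
                                    range=[[-64, 64], [-64, 64]])
        return hist / max(hist.sum(), 1)  # normalize, so image size drops out

    def histogram_intersection(h1, h2):
        """Match score in [0, 1]; 1.0 means identical distributions."""
        return np.minimum(h1, h2).sum()

    def recognize(image, models, threshold=0.6):
        """Return the best-matching trained object label, or None."""
        h = receptive_field_histogram(image.astype(float))
        label, score = max(((name, histogram_intersection(h, m))
                            for name, m in models.items()), key=lambda t: t[1])
        return label if score >= threshold else None

Because the histograms discard the absolute position of filter responses, the representation is inherently invariant to translation, and normalization makes it robust to changes in apparent object size.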
Hardware
The system is fully wearable, but the processing is done off-board via
wireless links. Its main hardware components are:
- ELMO CCD QN401E color camera
- Wireless 3-button mouse
- Wireless microphone
- Glasstron heads-up display and headphones
- SGI O2
- Wavecom Jr transmitter/receiver units
Interface
The interface paradigm is Record and Associate: using just two buttons of a
wireless mouse, the user selects when to record video and audio in real time
(see figure below). A third button (garbage) provides negative feedback,
signaling to the system that an association it learned is incorrect and should
be deleted. A sketch of this interaction loop follows.
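The loop below is a minimal sketch of that interaction, assuming hypothetical helpers current_frame(), record_clip() and play() for the wearable's audio-visual I/O, a recognizer object with learn() and recognize() methods, and the AudioVisualMemory sketched earlier; the button numbering simply mirrors the three-button wireless mouse:

    def interface_loop(memory, recognizer, buttons):
        clip = None                              # most recently recorded clip
        while True:
            frame = current_frame()              # image from the head camera
            button = buttons.poll()              # non-blocking button read
            if button == 1:                      # record: capture a new clip
                clip = record_clip()
            elif button == 2 and clip:           # associate: bind last clip
                label = recognizer.learn(frame)  # train on the viewed object
                memory.associate(label, clip)
            elif button == 3:                    # garbage: negative feedback
                label = recognizer.recognize(frame)
                if label:
                    memory.forget(label)
            else:                                # no press: passive recall
                label = recognizer.recognize(frame)
                if label and memory.recall(label):
                    play(memory.recall(label))   # overlay the clip on the HUD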
Applications
Some examples of objects or situations that DyPERS can recognize and augment
are:
- Clock: Every time DyPERS recognizes a wall clock or the user's watch, it displays a video with the user's schedule for the day.
- Demo poster: Just by looking at a poster for one of the lab's demos, DyPERS will show a short video and audio clip associated with the poster.
- Multilingual Teacher: The user could record the names of several objects in several languages. During recognition, DyPERS would speak the names of the learned objects in the different languages, teaching the user how to say them.
- Stuffed Animal: A story about the animal could be recorded in such a way that every time the child looks at the animal, DyPERS triggers the story associated with it.
- Augmented Storyteller: The parents could associate the pictures on each page of their children's book with the story about them. Later the child would hear the story just by looking at the pages of the book.
- Keypad Door Lock: By recording the keypad door combination with an image of the keypad, DyPERS would remind the user of the right combination whenever the user looks at the keypad.
- Business Card: A video and audio clip of a conversation with an important person would be associated with his or her business card. Then every time the user looks at the business card, the video and audio would play and remind the user of whom the card belongs to.
- Origami: DyPERS could teach the user how to create different origami objects by playing video and audio about how to make them.
- Specific Machinery: Instructions on how to replace parts of an appliance, or how to use it, would be associated with the appliance itself and played back when looking at it. For example, instructions on how to change a printer's ink cartridges would be associated with the printer.
- CD Cover or Poster of a Movie: A CD cover or a movie poster would be associated with clips of the music contained on the CD or a short preview of the movie. When the user looks at the cover or poster, DyPERS would play the associated audio and video, giving the user a good cue of what type of music or movie it is.
- Blind People: Important objects could be associated with audio information, so that DyPERS could describe what is in front of visually impaired persons in a personalized and private way.
- Medicine: The user could record the doctor's directions on how to take a specific medicine along with its box or container. When necessary, just by looking at the medicine, DyPERS would automatically play the instructions back to the user.
- Store Name/Logo: The name or logo of a specific store, associated with its nearest location, its schedule and the items the user would be interested in purchasing there.
- Art Objects: Images, paintings or sculptures, associated with explanations about them, their location and other relevant information.
The output consists of video and audio superimposed on the real video images.
The system is dynamic, personal and trainable.
Bibliography:
- "DyPERS: Dynamic Personal Enhanced Reality System". Bernt Schiele, Nuria Oliver, Tony Jebara and Alex Pentland. Intl. Conference on Computer Vision Systems (ICVS'99), Jan 1999, Gran Canaria, Spain.
- "Sensory Augmented Computing: Wearing the Museum's Guide". Bernt Schiele, Tony Jebara and Nuria Oliver. IEEE Micro Journal, 2001.
- "Stochasticks: Augmenting the Billiards Experience with Probabilistic Vision and Wearable Computers". Tony Jebara, Cyrus Eyster, Josh Weaver, Thad Starner and Alex Pentland. Proc. of the First Intl. Symposium on Wearable Computers, Oct 1997, Cambridge, MA.
- "Object Recognition Using Multidimensional Receptive Field Histograms". Bernt Schiele. PhD thesis, I.N.P. Grenoble, France, July 1997.