
Michael Harville


Research Scientist


2/15/2012: This page is over 4 years old, and will be completely updated within the next week.

Please come back to see it in a few days!

Statement of Purpose/Guiding Vision
This document helps me justify what I spend my time doing all day, and maybe you'll find that it does the same for you. If so, feel free to point a link at it!




Research Interests
Some of these are more on the back burner than others, but I enjoy discussing and thinking about all of them.

  • Computer vision for perceptive computing and natural human-computer interfaces 
    • Person detection, tracking, recognition
    • Head and body pose analysis
      • Movie (4.5MB) of real-time body pose classification with one stereo camera. The pose and body-orientation labels were generated automatically by the vision system in real time.
      • Movie (2MB) of real-time body orientation estimation with one stereo camera. The body-orientation vector was generated automatically in real time.
      • CVPR 2004 paper integrating pose recognition and location tracking
      • Movie of real-time, 3D head orientation and location tracking, using direct motion estimation in range and luminance images
        • ICCV'99 paper introducing the theory of the Depth Change Constraint Equation (DCCE), analogous to classic BCCE, for direct motion estimation in range images
        • This work has been extended (a lot) to create the MIT Watson head tracker
      • CVPR 2000 paper on real-time articulated body tracking using "twists", in color and depth
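
The analogy between the DCCE and the classic BCCE can be sketched roughly as follows (my paraphrase of the idea, not the paper's exact notation):

```latex
% Classic BCCE: the brightness I(x, y, t) of a scene point is assumed
% constant over time, so its total time derivative vanishes:
I_x u + I_y v + I_t = 0
% DCCE: the depth Z(x, y, t) of a moving surface point is not constant --
% it changes by exactly the point's velocity w along the optical axis,
% giving an analogous constraint with a nonzero right-hand side:
Z_x u + Z_y v + Z_t = w
```

Here (u, v) is the image-plane motion of the point. The nonzero right-hand side is the key difference: range data lets you constrain out-of-plane motion directly, which brightness alone cannot.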
    • Face finding and tracking
      • IJCV paper on the first real-time, vision-based multi-face tracker there ever was (publicly demoed in 1998)
      • Mass Hallucinations art piece, shown at CVPR'98, Siggraph '97, Siggraph '98, and San Jose Tech Museum
    • Gesture recognition
  • Image and video segmentation
    • Modality combination for segmentation
    • Adaptive background modeling and removal
      • Paper on real-time, adaptive foreground segmentation in color and depth
      • Paper on generalized framework for using high-level feedback to guide TAPPMOG (Time-Adaptive Per-Pixel Mixture-Of-Gaussian) background models for real-time foreground segmentation
    • Dense depth computation from stereo cameras
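
The time-adaptive, per-pixel idea behind TAPPMOG-style background modeling can be illustrated with a single-Gaussian-per-pixel simplification (all names and parameter values below are my own illustration, not from the papers):

```python
import numpy as np

def update_background(frame, mean, var, lr=0.05, k=2.5):
    """One step of a per-pixel running-Gaussian background model
    (a single-Gaussian simplification of the per-pixel
    mixture-of-Gaussians approach; parameters are illustrative).
    Returns a foreground mask plus the updated model."""
    frame = frame.astype(np.float64)
    # Pixels far from the model mean (in standard deviations) are foreground.
    fg = np.abs(frame - mean) > k * np.sqrt(var)
    # Adapt the model toward the new frame -- the "time-adaptive" part.
    mean = (1 - lr) * mean + lr * frame
    var = (1 - lr) * var + lr * (frame - mean) ** 2
    return fg, mean, var

# Toy usage: a flat background of value 100, one bright "object" pixel.
mean = np.full((4, 4), 100.0)
var = np.full((4, 4), 25.0)
frame = mean.copy()
frame[1, 2] = 200.0
fg, mean, var = update_background(frame, mean, var)
```

A real mixture model keeps several Gaussians per pixel so it can absorb multimodal backgrounds (e.g. swaying foliage), and the color-and-depth work adds a depth channel to the per-pixel feature vector.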
  • Projector-camera systems

      Curved screen and four projectors, showing football game

Code-named "Panoply", this project developed technology for automatically building large, high-resolution, seamless, curved displays from multiple cheap projectors. Applications include home entertainment, gaming, scientific visualization, and videoconferencing.

    • Video from Consumer Electronics Show (CES) 2007, where we demoed three Panoply screens with motion chairs in the Voodoo PC booth to create fun gaming demos (see more below). Note: interaction between the projector color wheel and the video camera frame rate causes strange color-shift artifacts that you do not see in person.
    • Playing Quake4 game on Panoply at HP Gaming media event - cooooool :)
    • Movie of example 4-projector display, showing six movies at the same time. (Sorry, we really ought to make a better video than this - it looks better in person!)
    • "Panoply" everywhere! : An immersive gaming demo shown in 2007 at CES, NAB, D5, the Game Developer's Conference, and an HP Gaming Summit press event, among other places. Consists of:
      • 2-projector tiled display on curved surface with 8-foot arc
      • D-Box GP-100 motion chair that tilts, jumps, and vibrates along with game action
      • rFactor race car driving game, and/or Lunar Racing games
      • Voodoo Omen PC - the Ferrari of computers; world's fastest desktop PC

Tons of fun - the closest most of us will ever get to being in a race car! Many articles and videos online, including:


    • ProCams 2006 paper on tiling projectors on curved screens, with proper geometric and photometric correction
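
One small piece of the photometric correction — cross-fading intensity across the region where adjacent projectors overlap so the seam is invisible — can be sketched in one dimension (a minimal illustration of the general idea; the real system also handles curved-surface geometry, per-projector gamma, and color):

```python
import numpy as np

def overlap_ramps(n):
    """Linear cross-fade weights for an n-pixel overlap strip between
    two side-by-side projectors. The left projector fades 1 -> 0 while
    the right fades 0 -> 1, so their combined intensity is uniform
    across the seam. (Illustrative sketch, not the Panoply code.)"""
    t = np.linspace(0.0, 1.0, n)
    return 1.0 - t, t

left, right = overlap_ramps(5)
# At every pixel of the strip, left + right == 1, so brightness is flat.
```

In practice the ramps are applied after gamma linearization, since cross-fading in the projector's nonlinear intensity space would leave a visible brightness dip in the overlap.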
  • Audiovisual media analysis


    • Automated cosmetics consulting: Trying to find the right shade of makeup? Just take a picture of yourself with your cellphone, send it via MMS to our website, and we'll text you back what shade to buy! I did much of the image processing for this, and put together the first prototypes.
    • Television commercial detection. See US Patent 6,993,245. Did this in 1999, but never had time to write a paper on it.
    • High-level event detection from multimodal (audio/video/closed-caption) analysis
    • Content recognition for copyright enforcement

Capturing full path (location+time) data along with your media, for enhanced reminiscing, search, and organization

    • Slideshow of some of the applications, architecture, data structures, and algorithms
    • Example applications:
      • Automatic generation of movie-like presentations of your trips, with your captured audio/video/images played as an icon representing you moves around a map along the path you traveled. See movie (18MB, wmv) showing a mock-up of the concept, for a San Francisco trip.
      • Fill the gaps in your trip diary, by comparing your path to geo-tagged image databases and searching for photos of interesting places you visited but "missed the shot". Perhaps you forgot to take a picture (too busy having fun) or were unable to take one ("no cameras allowed"), or your photos were poor (bad lighting, timing, weather, etc.). You might even find an image taken at the same place at the same time! 
      • Find yourself in other people's photos: your path is compared to the locations at which others took photos. Camera orientation data is also helpful.
      • Use your path to find the name of that great hotel, restaurant, or other point of interest you visited, even if you did not photograph it.
      • Take no photos at all on your trip, but use the captured path data to automatically generate a multimedia trip presentation by searching for other people's geo-tagged media at locations along your path.
    • Application domains: personal remembrance, photo sharing, real estate "house-hunting", vacation package tour supplement, documentation of scientific exploration expeditions, search and rescue, military reconnaissance.
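
The core matching step behind several of these applications — comparing a captured path against a database of geo-tagged photos — can be sketched with a simple proximity test (function names, the radius, and the toy data are all my own illustration):

```python
import math

def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two (lat, lon) points."""
    R = 6371000.0  # mean Earth radius in meters
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dp = math.radians(lat2 - lat1)
    dl = math.radians(lon2 - lon1)
    a = (math.sin(dp / 2) ** 2
         + math.cos(p1) * math.cos(p2) * math.sin(dl / 2) ** 2)
    return 2 * R * math.asin(math.sqrt(a))

def photos_near_path(path, photos, radius_m=200.0):
    """Return ids of photos geo-tagged within radius_m of any sample
    on the path. `path` is a list of (lat, lon) samples; `photos` maps
    photo id -> (lat, lon). (Illustrative sketch, not a real API.)"""
    hits = []
    for pid, (plat, plon) in photos.items():
        if any(haversine_m(lat, lon, plat, plon) <= radius_m
               for lat, lon in path):
            hits.append(pid)
    return hits

# Toy example: a short walk near San Francisco's Ferry Building.
path = [(37.7955, -122.3937), (37.7960, -122.3930)]
photos = {"ferry": (37.7956, -122.3936), "la": (34.05, -118.24)}
matches = photos_near_path(path, photos)  # the distant LA photo is excluded
```

A real system would also compare timestamps (to find photos taken while you were there) and camera orientation (to find photos that might actually contain you), as described above.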
  • Intelligent environments: design and architecture
  • Network-based media services architectures and applications

10th Bay Area Vision Meeting, co-organized by Donald Tanguay and myself, at HP in Palo Alto on March 4th, 2004.

You can reach me at:

(650) 857-3575
(650) 852-3791

Hewlett Packard Laboratories
1501 Page Mill Road, ms 1181
Palo Alto, CA 94304


Vision and Graphics Project

Mobile and Media Systems Lab (MMSL)

Vision and Graphics Reading Group