Gesture-Based Computing


What makes something interesting? It’s an intriguing question. According to research by Paul Silvia of the University of North Carolina’s Department of Psychology, interest arises from a cluster of variables around novelty and complexity, combined with a cluster of variables around comprehensibility. In simple language, in order to be interesting, something has to be both new and understandable.

I think one of the smart things about the iPad is that it does enough to be genuinely different and new, without being so revolutionary that people don’t get it. The touchscreen already feels like a perfectly intuitive way of interacting with content. And it’s not difficult to see where this is headed – interacting with content without having to touch anything. Apple have already been granted a patent for a proximity-sensing touchscreen that could detect when an object (like a finger) is close to the screen but not touching it, and then offer up context-dependent controls. Applications like Text 2.0 "allow text to know how and when it is being read". Perhaps it won’t be so long before Minority Report-style gestural user interfaces are a reality. They are, as Faris noted at the end of last year, the future of human-computer interaction.
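To make the proximity idea a little more concrete, here’s a toy sketch in Python of how hover distance might drive context-dependent controls. The threshold, sensor reading and control names are all my own invention – a rough illustration of the behaviour described in the patent coverage, not anything from Apple’s actual design:

```python
# Toy sketch: if a fingertip hovers within a few millimetres of the screen,
# surface controls relevant to whatever sits underneath it, before any touch.
# The threshold and control names below are hypothetical.

HOVER_THRESHOLD_MM = 15.0  # assumed hover distance at which controls appear

CONTEXT_CONTROLS = {
    "video": ["play/pause", "scrub", "volume"],
    "text": ["select", "copy", "define"],
}

def controls_for_hover(distance_mm: float, element_type: str) -> list[str]:
    """Return the context-dependent controls to show for a hovering finger."""
    if distance_mm <= HOVER_THRESHOLD_MM:
        return CONTEXT_CONTROLS.get(element_type, [])
    return []  # finger too far away: keep the interface uncluttered

print(controls_for_hover(8.0, "video"))   # ['play/pause', 'scrub', 'volume']
print(controls_for_hover(40.0, "video"))  # []
```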

As this New Scientist piece points out, articulated hand-tracking systems have rarely been deployed in consumer applications because of their prohibitive expense and complexity. Those that have been developed for gaming (like Microsoft’s Project Natal, now renamed Kinect, and Sony’s ICU) have captured only broad body movement, not the detailed movement of hands, making it difficult to manipulate virtual objects. Until now.

Robert Wang, at MIT’s artificial intelligence lab, has developed a system that means gesture-based computing requires nothing more than a multicoloured glove, a webcam and a laptop.

The game-changer here is that instead of using prohibitively expensive and complex motion-capture systems with sensors placed around the body (like those used for Hollywood special effects), his system uses the computer’s webcam to identify the hand’s position from a database of 100,000 pre-stored images. Once it finds a match it displays the corresponding pose on screen, and by repeating this many times per second it can recreate gestures in real time. A similar system, developed by Javier Romero and Danica Kragic of the Royal Institute of Technology in Stockholm, attempts to do the same thing using your hand’s flesh tones, meaning you don’t have to wear a glove at all. Perhaps this will be the basis for the system that enables gestural UI for the masses. An application that is cheap and simple. Genuinely different and new, yet intuitive to use. We’ve all seen the future. Maybe it isn’t as far away as we think.
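For a sense of how cheap the lookup approach could be, here’s a rough Python/NumPy sketch of the general idea – matching a tiny, normalised glove thumbnail against a database of pre-stored images and returning the pose behind the nearest one. The thumbnail size, distance metric and database contents are assumptions for illustration, not Wang’s actual implementation:

```python
import numpy as np

TINY = (40, 40)  # assumed: frames are shrunk to a tiny thumbnail before matching

def normalise(image: np.ndarray) -> np.ndarray:
    """Flatten a tiny colour image and scale it to unit length."""
    v = image.astype(np.float32).ravel()
    return v / (np.linalg.norm(v) + 1e-8)

class PoseDatabase:
    """Stores (thumbnail, pose) pairs and answers nearest-neighbour queries."""

    def __init__(self):
        self.keys = []   # normalised thumbnails of stored glove images
        self.poses = []  # the hand pose (e.g. joint angles) behind each image

    def add(self, thumbnail: np.ndarray, pose) -> None:
        self.keys.append(normalise(thumbnail))
        self.poses.append(pose)

    def nearest_pose(self, thumbnail: np.ndarray):
        """Return the stored pose whose thumbnail best matches the query."""
        query = normalise(thumbnail)
        keys = np.stack(self.keys)                    # shape (N, D)
        distances = np.linalg.norm(keys - query, axis=1)
        return self.poses[int(np.argmin(distances))]

# Per webcam frame: crop the glove, shrink it to TINY, look up the pose.
# Repeating this many times per second gives the real-time effect described.
if __name__ == "__main__":
    db = PoseDatabase()
    rng = np.random.default_rng(0)
    for i in range(1000):  # stand-in for the ~100,000 pre-stored images
        db.add(rng.random((*TINY, 3)), pose=f"pose_{i}")
    frame_crop = rng.random((*TINY, 3))  # stand-in for a cropped webcam frame
    print(db.nearest_pose(frame_crop))
```

The appeal of this design is that the expensive work (rendering and storing the example images) happens offline, leaving only a fast lookup to run per frame on ordinary hardware.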

***UPDATE***

justonlyjohn also kindly pointed me at Oblong – developer of the "g-speak spatial operating environment".

Video: "g-speak overview 1828121108" from John Underkoffler on Vimeo.

Original Post: http://neilperkin.typepad.com/only_dead_fish/2010/06/gesturebased-computing.html