Intel demos next-generation voice and gesture interfaces

Offers a million bucks for the best 'perceptual computing' idea

IDF 2012 Intel wants computers to be as smart as humans in how they understand voices and gestures – and it's offering $1m to the best idea that can help achieve that goal.

"Human beings are very rich in the way that they interface with each other, the way they interact with each other," David Perlmutter, the general manager of Intel's Architecture Group, told the audience during his opening keynote at the Intel Developer Forum on Tuesday in San Francisco.

A principal part of that interaction, of course, is voice, and Perlmutter introduced a demo of a Dell XPS 13 Ultrabook running a beta version of Dragon Assistant by Nuance, which he said would be released in a public beta in the fourth quarter of this year and in production by the first quarter of 2013.

In the demo, Dragon Assistant responded to simple voice requests, such as searching Google for pictures of San Francisco, and also handled more complex tasks, such as looking for sunglasses on Amazon and then sharing a link to the Amazon results page along with a voice-to-text Twitter message asking followers for suggestions.

In each case, Dragon Assistant's female voice identified – Siri-like – the search results. Interestingly, the demo showed how the voice-recognition technology could also correctly understand poor grammar and relatively poor pronunciation of foreign song titles when asked to play specific tunes.

But people don't just communicate by voice, Perlmutter said. "They don't just use voice, they don't just use handwriting, they don't just use touch," he said. "They use gestures: handshake gestures, hand gestures, finger gestures."

Gestural interaction was demoed using a compact USB-powered Creative 3D camera coupled with SoftKinetic's 3D gesture-recognition middleware. In its first iteration, Perlmutter said, the camera will be a separate unit mounted on top of, say, a laptop display, but 3D-recognition cameras of the future will be integrated into laptops – and, one assumes, Ultrabooks.

Intel's David Perlmutter demonstrating gestural recognition at the Intel Developer Forum

Perlmutter: Each finger has a role in gestural recognition

The demo showed that the camera and software can recognize not only large gestures, but finger gestures as well. Such recognition capabilities, Perlmutter said, were "just the beginning" of gestural interaction.

At Intel Labs, he said, there's work being done on the ability to play virtual catch with virtual objects. "I thought about what will happen if I had all these virtual objects," Perlmutter said, "and I can have a discussion with Skype or whatever other video-conferencing capability with my granddaughter – I will be able to play with her across the ocean."

To grease the skids of what the company calls "perceptual computing" – touch, voice, fine-grained gesture recognition, facial and object recognition, and other modes of human-computer interaction – Intel will soon make available a perceptual computing SDK for use with Creative's Interactive Gesture Camera Developer Kit, which includes a 3D HD camera that can interpret gestures made between roughly 6 inches and 3 feet away.

In addition, Intel will host a Perceptual Computing Challenge with a prize of $1m in awards and promotions for the best submission. The contest will go live in the fourth quarter of this year, and will debut on Intel's Perceptual Computing website.

Perlmutter also offered an idea of what he hopes will come after perceptual computing. In addition to being able to understand your gestures, computers should also be able to figure out what you want without you having to be thoroughly specific in your requests, he said. "I call it my 'wife dream'. I'll figure out and guess what she wants and be ready to go when she just wants it."

Computers should be able to do the same, he said. But first they need to respond to your touch, do what you tell them to with your voice, and respond to your gestures, whether those gestures be as broad as those made by your entire body, or as subtle as a wiggle of your finger. ®
