IBM's megabrain Watson to make mobe, slab apps smarter? Not so fast

We drill into Big Blue's dream to put data-muncher monster in your palm

Analysis IBM wants developers to build smartphone apps that use Big Blue's clever Jeopardy!-beating Watson software.

But harnessing the TV star's silicon brain will require more than just invoking a few API calls with JSON: the app programmers will have to do a lot of heavy lifting themselves to train Watson.

The Watson Mobile Developer Challenge was fluffed by IBM in a press release on Wednesday and during a speech by chief executive Virginia Rometty at Mobile World Congress.

It gives programmers a chance to compete over the next three months to come up with ideas for "cognitive computing" applications that use Watson's capabilities, and successful ones will be paired with IBM's Interactive Experience Group to help them develop a viable commercial product.

For now, we'll put aside the fact that most other app competitions involve the winner getting cold hard cash – Salesforce shelled out $1 million, for instance. Instead, we'll delve into some of the technical issues that will make developing for Watson a new and sometimes frustrating experience.

IBM claims that Watson "processes information akin to how people think." This is half right: Watson builds an internal model of the data you throw at it, but training it to deal with that information takes a long time, and the resulting model is still quite brittle.

Watson's fundamental technology is a decision engine that is able to analyze and answer questions about data loaded into it, such as what symptoms may be indicative of certain cancers, or the correct financial product to recommend to someone given their situation. It is an immensely powerful technology and represents years of research by IBM and academics.

The catch, as highlighted by El Reg, is that this approach requires developers to load a large amount of data into Watson's underlying Hadoop and Apache UIMA-based "DeepQA" analysis engine. Watson then needs to be trained on the data so it can develop an appropriate mental model of the information. This takes time, and limits the range of apps that can be built on the system.

"The way that training occurs is through an iterative process, very much like school," explained IBM Watson veep Stephen Gold, in a chat with El Reg.

"How long it takes is a byproduct of the actual use case. If I'm teaching basic arithmetic, the process moves very quickly and I can answer questions in a very short period of time. If I want the system to be able to perform advanced differential equations, I know I need to build through an advanced set of learnings."

'We don't have a lot of partners who want to boil the ocean'

Although new datasets take a while to integrate, once this is complete, related material can be added in a shorter timeframe: when Watson was first put to work analyzing cancer, it took a year to fully integrate information involving lung cancer, but then only took six months to add breast cancer, and three months to add in colon cancer, Gold explained.
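Gold's figures work out to each related dataset taking half as long to integrate as the one before. A back-of-the-envelope sketch in Python makes the arithmetic explicit. It uses only the three durations quoted above, with "a year" taken as 12 months; the function names are our own, and nothing here implies the halving would hold for a fourth dataset.

```python
# Integration times quoted by IBM's Stephen Gold, in months.
# "A year" for lung cancer is taken as 12 months.
reported_months = {"lung": 12, "breast": 6, "colon": 3}


def cumulative_months(times):
    """Return the running total of months spent after each dataset is added."""
    total = 0
    totals = {}
    for domain, months in times.items():
        total += months
        totals[domain] = total
    return totals


def step_ratios(times):
    """Return each integration time divided by the one before it."""
    values = list(times.values())
    return [later / earlier for earlier, later in zip(values, values[1:])]


totals = cumulative_months(reported_months)
ratios = step_ratios(reported_months)

print(totals)  # {'lung': 12, 'breast': 18, 'colon': 21}
print(ratios)  # [0.5, 0.5]
```

By these numbers, all three cancer datasets took 21 months in total, with the first accounting for more than half of that.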

So, when IBM says it hopes developers will build apps for Watson, it's worth pointing out that if IBM hasn't already stored the exact data the developers would like to drill into, they will need to work with IBM to get that knowledge into Watson.

This could take "between weeks and months," Gold said, before pointing out that "most of the apps we see are not nearly as complex as cancer. Most of them have a finite information they're working with [such as] product manuals."

In the short term, Watson will likely be an amazing technology for apps that require a decision-making capability, but we're a long, long way from the types of general intelligence models that would turn Watson from a sophisticated Fabergé egg into a tech of broad utility.

"To get to a general application, what you'd have to do is have enough experience and time to train Watson on all things possible," Gold explained. "What we find with applications, they are very purposeful in the [particular] problem they are trying to solve. We don't have a lot of partners who want to boil the ocean."

Watson will "continue to get smarter," he said, but as IBM tries to integrate more and more data into a single model it will start to run into a problem that even human adults have trouble solving – dealing with contradicting data.

"It's not so much about the volume [of data]," Gold explained. "If the veracity is high of that information and is uncontroverted, it could be a terabyte and you could train Watson quickly, but where the veracity is in question or sources of evidence are contradictory, Watson needs to iterate through not only the training but also the use... there's a lot more to do not so much with the number of sources of data, it's how much the data that's being collected is conflicted with itself."

For example, if Watson had been fed a full dataset from the 1400s that stated unequivocally the world was flat, it would take it some time to adjust to new data coming in that stated the world was round, but adjust it would. This is fundamentally different to how current computers work and is a laudable, fascinating bit of technology. It is not, however, easy or simple or trivial, so developers keen to develop Watson apps will need to be very specific at first about the types of data they want to draw on. We wish them the best of luck in grappling with Big Blue's Big Brain. ®
