IBM's megabrain Watson to make mobe, slab apps smarter? Not so fast

We drill into Big Blue's dream to put data-muncher monster in your palm

Analysis IBM wants developers to build smartphone apps that use Big Blue's clever Jeopardy!-beating Watson software.

But harnessing the TV star's silicon brain will require more than just invoking a few API calls with JSON: the app programmers will have to do a lot of heavy lifting themselves to train Watson.

The Watson Mobile Developer Challenge was fluffed by IBM in a press release on Wednesday and during a speech by chief executive Virginia Rometty at Mobile World Congress.

It gives programmers a chance to compete over the next three months to come up with ideas for "cognitive computing" applications that use Watson's capabilities, and successful ones will be paired with IBM's Interactive Experience Group to help them develop a viable commercial product.

For now, we'll put aside the fact that most other app competitions involve the winner getting cold hard cash – Salesforce shelled out $1 million, for instance. Instead, we'll delve into some of the technical issues that will make developing for Watson a new and sometimes frustrating experience.

IBM claims that Watson "processes information akin to how people think." This is half right – Watson does build an internal model to make sense of the data you throw at it, but training it to deal with that information takes a long time, and the result is still quite brittle.

Watson's fundamental technology is a decision engine that is able to analyze and answer questions about data loaded into it, such as what symptoms may be indicative of certain cancers, or the correct financial product to recommend to someone given their situation. It is an immensely powerful technology and represents years of research by IBM and academics.

The catch, as highlighted by El Reg, is that this approach requires developers to load a large amount of data into Watson's underlying Hadoop and Apache UIMA-based "DeepQA" analysis engine. Watson then needs to be trained on the data to allow it to develop an appropriate mental model of the information. This takes time, and limits the range of apps that can be built on the system.

"The way that training occurs is through an iterative process, very much like school," explained IBM Watson veep Stephen Gold, in a chat with El Reg.

"How long it takes is a byproduct of the actual use case. If I'm teaching basic arithmetic, the process moves very quickly and I can answer questions in a very short period of time. If I want the system to be able to perform advanced differential equations, I know I need to build through an advanced set of learnings."

'We don't have a lot of partners who want to boil the ocean'

Although new datasets take a while to integrate, once this is complete, related material can be added in a shorter timeframe: when Watson was first put to work analyzing cancer, it took a year to fully integrate information involving lung cancer, but then only took six months to add breast cancer, and three months to add in colon cancer, Gold explained.

So when IBM says it hopes developers will build apps for Watson, it's worth pointing out that if IBM hasn't already stored the exact data the developers would like to drill into, they will need to work with IBM to get that knowledge into Watson.

This could take "between weeks and months," Gold said, before pointing out that "most of the apps we see are not nearly as complex as cancer. Most of them have a finite information they're working with [such as] product manuals."

In the short term, Watson will likely be an amazing technology for apps that require a decision-making capability, but we're a long, long way from the types of general intelligence models that would turn Watson from a sophisticated Fabergé egg into a tech of broad utility.

"To get to a general application, what you'd have to do is have enough experience and time to train Watson on all things possible," Gold explained. "What we find with applications, they are very purposeful in the [particular] problem they are trying to solve. We don't have a lot of partners who want to boil the ocean."

Watson will "continue to get smarter," he said, but as IBM tries to integrate more and more data into a single model it will start to run into a problem that even human adults have trouble solving – dealing with contradicting data.

"It's not so much about the volume [of data]," Gold explained. "If the veracity is high of that information and is uncontroverted, it could be a terabyte and you could train Watson quickly, but where the veracity is in question or sources of evidence are contradictory, Watson needs to iterate through not only the training but also the use... there's a lot more to do not so much with the number of sources of data, it's how much the data that's being collected is conflicted with itself."

For example, if Watson had been fed a full dataset from the 1400s that stated unequivocally the world was flat, it would take it some time to adjust to new data coming in that stated the world was round, but adjust it would. This is fundamentally different to how current computers work and is a laudable, fascinating bit of technology. It is not, however, easy or simple or trivial, so developers keen to develop Watson apps will need to be very specific at first about the types of data they want to draw on. We wish them the best of luck in grappling with Big Blue's Big Brain. ®
