Feeds

Googlers devise DeViSE: A thing-recognising FRANKENBRAIN

Machine-learning tech glues together image eyeballing and text grokking

3 Big data security analytics techniques

You'd think guesswork and advanced science would be natural enemies, but not at Google where a crack team of researchers are trying to mate the two together.

In a paper presented on Monday at an artificial-intelligence conference in California, seven Google researchers outlined their image classifier, software that labels pictures by identifying what's in them. It was created by fusing two distinct machine-learning approaches together.

In short, the system can make an educated guess at identifying an unfamiliar picture based on the text labels offered to it. For example, if it was shown a photo of a black Victorian top hat it hadn't seen before, and asked if it was a black Victorian top hat or a black pedal-opened wastepaper bin – both labels it also hadn't heard of before – it could guess correctly because it knows what various other hats and garbage bins look like and knows the relationships between their labels.

The DeViSE: A Deep Visual-Semantic Embedding Model paper [PDF] describes a tech that strives to combine the eerie image recognition capabilities of Google's traditional weak-AI systems with the broad semantic modeling capabilities of its "Skip-gram" text classifiers.

This approach is called "zero-shot learning", and is seen by the Google brain trust (which includes MapReduce-creator Jeff Dean) as one of the best chances of designing systems that can deal with changeable datasets with poor classifications – in other words, the info Google's growing fleet of handheld or wheel-bound electronic eyes are likely to slurp up from the world around them.

"The goals of this work are to develop a vision model that makes semantically relevant predictions even when it makes errors and generalizes to classes outside of its labeled training set," they write.

DeViSE contains two elements: a text classifier that labels text based on its contents, and an object recognizer that studies images.

The text classifier trains a neural language model using 5.7 million documents comprising 5.4 billion words slurped from Wikipedia. The approach lets the tech convert the fuzzy world of language into a numeric graph in which each word is defined by its relationships with others.

The image recognizer, meanwhile, is a "state-of-the-art deep neural network for visual object recognition" that was trained to recognize some 1,000 categories of images.

Armed with these two power technologies, the researchers figured out a way to fuse the two together so that the model could use both approaches when attempting to classify a new image.

This model is marginally more accurate than today's state-of-the-art systems and is inherently more flexible. The researchers hypothesized:

A DeViSE model that was trained on images with labels like "tiger shark", "bull shark", and "blue shark", but never with images labeled simply "shark", would likely have the ability to generalize to this more coarse-grained descriptor because the language model has learned a representation of the general concept of "shark" which is similar to all of the specific sharks. Similarly, if tested on images of highly specific classes which the model happens to have never seen before, for example a photo of an oceanic whitecap shark, and asked whether the correct label is more likely "oceanic whitecap shark" or some other unfamiliar label (say, "nuclear submarine"), our model stands a fighting chance of guessing correctly because the language model ensures that representation of "oceanic whitecap shark" is closer to the representation of sharks the model has seen, while the representation of "nuclear submarine" is closer to those of other sea vessels.

Subsequent experiments detailed in the paper bore out this theory.

Google believes the system has a broad range of applications in some of the search giant's trickiest problem areas.

"We believe that our model's unusual compatibility with larger, less manicured data sets will prove to be a major strength moving forward," the nine wrote. "Though here we trained on a curated academic image dataset, our model's architecture naturally lends itself to being trained on all available images that can be annotated with any text term contained in the (larger) vocabulary. We believe that training massive "open" image datasets of this form will dramatically improve the quality of visual object categorization systems."

And once Google has honed the capabilities of this tech further, it could be used for a multitude of problems, such as distinguishing between categories like dogs, cats, and lawnmowers, and also specific entities, like telling the difference between cars such as a "Honda Civic, Ferrari F355, Tesla Model-S" they note – capabilities that are crucial ingredients for further developments in Google's key business of highly targeted, automated advertising.

As oil is to the plastics industry, data is to Google: it is the fundamental resource on which the company depends, and the more it can refine it, the more money it can make from it. For this reason machine learning and other deep analytical approaches are a priority for Google as the ad-slinger attempts to automate the classification and tagging of an ever-swelling world of digital data, with this system it has devised another approach to let it slurp more cash from the ethereal digital world. ®

SANS - Survey on application security programs

More from The Register

next story
This time it's 'Personal': new Office 365 sub covers just two devices
Redmond also brings Office into Google's back yard
Batten down the hatches, Ubuntu 14.04 LTS due in TWO DAYS
Admins dab straining server brows in advance of Trusty Tahr's long-term support landing
Microsoft lobs pre-release Windows Phone 8.1 at devs who dare
App makers can load it before anyone else, but if they do they're stuck with it
Half of Twitter's 'active users' are SILENT STALKERS
Nearly 50% have NEVER tweeted a word
Oh no, Joe: WinPhone users already griping over 8.1 mega-update
Hang on. Which bit of Developer Preview don't you understand?
Internet-of-stuff startup dumps NoSQL for ... SQL?
NoSQL taste great at first but lacks proper nutrients, says startup cloud whiz
Windows 8.1, which you probably haven't upgraded to yet, ALREADY OBSOLETE
Pre-Update versions of new Windows version will no longer support patches
Microsoft TIER SMEAR changes app prices whether devs ask or not
Some go up, some go down, Redmond goes silent
Ditch the sync, paddle in the Streem: Upstart offers syncless sharing
Upload, delete and carry on sharing afterwards?
prev story

Whitepapers

Designing a defence for mobile apps
In this whitepaper learn the various considerations for defending mobile applications; from the mobile application architecture itself to the myriad testing technologies needed to properly assess mobile applications risk.
3 Big data security analytics techniques
Applying these Big Data security analytics techniques can help you make your business safer by detecting attacks early, before significant damage is done.
Five 3D headsets to be won!
We were so impressed by the Durovis Dive headset we’ve asked the company to give some away to Reg readers.
The benefits of software based PBX
Why you should break free from your proprietary PBX and how to leverage your existing server hardware.
Securing web applications made simple and scalable
In this whitepaper learn how automated security testing can provide a simple and scalable way to protect your web applications.