Googlebooks crusade captures CAPTCHA king

Fights spam. Pumps OCR

Google has acquired reCAPTCHA, a free CAPTCHA service that also serves as a means of digitizing printed books and newspapers. Among other things, the Mountain View web giant is looking to juice its ever-controversial library-scanning Book Search project.

Google announced the acquisition this morning with a post to the Official Google Blog, and it couldn't help but trumpet the news with, yes, a CAPTCHA:

Google Acquires ReCaptcha

"The image above is a CAPTCHA — you can read it, but computers have a harder time interpreting the letters. We tried to make it hard for computers to recognize because we wanted to give humans the scoop first, but we're happy to announce to everybody now that Google has acquired reCAPTCHA, a company that provides CAPTCHAs to help protect more than 100,000 websites from spam and fraud," the post reads.

But it's not just spam and fraud protection that interests the Mountain View Chocolate Factory. ReCAPTCHA is also a way for Google to improve the OCR (optical character recognition) technology it uses to digitize printed materials for both its Book Search and News Archive Search services.

In providing websites with CAPTCHAs - visual Turing tests that separate humans from machines - reCAPTCHA often includes text scanned from books and newspapers that can't be read with OCR. It pairs this unknown text with a recognized word or phrase. Website visitors are asked to read both words, and if they get the known word correct, reCAPTCHA assumes they also read the unknown text correctly.
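The pairing trick described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not reCAPTCHA's actual code: the names `Challenge`, `Transcriber`, `submit` and `best_guess` are hypothetical, and the rule that an unknown word needs at least two matching answers before it is accepted is our own assumption rather than anything Google or reCAPTCHA has stated.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class Challenge:
    known_word: str        # control word whose correct answer is already known
    unknown_image_id: str  # hypothetical ID for a scanned word OCR could not read


@dataclass
class Transcriber:
    # Maps each unknown image ID to a tally of the answers humans have typed.
    votes: dict = field(default_factory=dict)

    def submit(self, challenge, known_answer, unknown_answer):
        """Return True if the visitor passes the test.

        The guess for the unknown word is recorded only when the
        known (control) word was typed correctly.
        """
        if known_answer.strip().lower() != challenge.known_word.lower():
            return False  # failed the control word: treat as a bot, discard guess
        tally = self.votes.setdefault(challenge.unknown_image_id, Counter())
        tally[unknown_answer.strip().lower()] += 1
        return True

    def best_guess(self, image_id, min_votes=2):
        """Return the most common answer once enough humans agree, else None.

        min_votes is an assumed threshold chosen for this sketch.
        """
        tally = self.votes.get(image_id)
        if not tally:
            return None
        word, count = tally.most_common(1)[0]
        return word if count >= min_votes else None


if __name__ == "__main__":
    service = Transcriber()
    challenge = Challenge(known_word="morning", unknown_image_id="scan-001")
    service.submit(challenge, "morning", "upon")
    service.submit(challenge, "mourning", "spam")  # wrong control word, ignored
    service.submit(challenge, "Morning", "Upon")
    print(service.best_guess("scan-001"))
```

In this sketch, a visitor who mistypes the control word contributes nothing to the transcription, which is the property that lets the same test both block bots and feed the digitization effort.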

ReCAPTCHA - a Pittsburgh, Pennsylvania-based outfit that spun off from research originated at Carnegie Mellon University - is currently helping the New York Times to digitize its archive.

Luis von Ahn, the reCAPTCHA founder who co-authored Google's blog post, is one of the Carnegie Mellon researchers who coined the term CAPTCHA, short for Completely Automated Public Turing test to tell Computers and Humans Apart. ReCAPTCHAs first hit the web in 2007, and von Ahn founded the company in 2008. The Carnegie Mellon assistant computer science professor has not responded to our request for comment.

"Google is the best fit for reCAPTCHA," reads a canned statement from von Ahn tucked into a press release. "From the very start, people often assumed the project was connected to Google, so it only makes sense that reCAPTCHA Inc. ultimately would find a home within Google."

Von Ahn will remain on the Carnegie Mellon computer science faculty, but he will also work at Google's Pittsburgh engineering office, which is on the university's campus. In the press release, he indicated that reCAPTCHA already has close ties with Google. In 2006, Google licensed a von Ahn-developed game for use in its Google Image Labeler. Terms of Google's acquisition were not disclosed. ®
