Security

Talk about a GAN-do attitude... AI software bots can see through your text CAPTCHAs

Code to defeat letter-based I'm-a-human tests revealed, major sites left wide open

Wed 5 Dec 2018 // 22:28 UTC

If you're one of those people who hates picking out cars, street signs and other objects in CAPTCHA image grids, then get used to it because the days of text-based alternatives are numbered.

CAPTCHA stands for "Completely Automated Public Turing test to tell Computers and Humans Apart." CAPTCHA tests are used to separate bots from people, as many internet users have seen.

They don't work flawlessly, which is why companies like Facebook are constantly purging fake accounts. And ongoing research into machine learning and image recognition techniques is making it harder still to design puzzles that vex software but not humans.

Boffins at Lancaster University in the UK, and Northwest University and Peking University in China, have devised an approach for creating text-based CAPTCHA solvers that makes it trivial to automatically decipher scrambled depictions of text.

Researchers Guixin Ye, Zhanyong Tang, Dingyi Fang, Zhanxing Zhu, Yansong Feng, Pengfei Xu, Xiaojiang Chen, and Zheng Wang describe their CAPTCHA cracking system in a paper that was presented at the 25th ACM Conference on Computer and Communications Security in October and has now been released to the public.

As can be surmised from the title, "Yet Another Text Captcha Solver: A Generative Adversarial Network Based Approach," the computer scientists used a GAN (Generative Adversarial Network) to train a CAPTCHA generator, whose output is in turn used to train their text recognition model.

First described in 2014, a GAN consists of two neural network models pitted against each other as adversaries, one simulating something and the other spotting problems with the simulation until any differences can no longer be identified.

Coincidentally, that's the same year researchers from Google and Stanford published a paper titled, "The End is Nigh: Generic Solving of Text-based CAPTCHAs." Four years on, the speed bumps limiting generic attacks have been paved over.
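For the curious, the two-network contest described above can be sketched in a few lines of Python. This is a toy under loud assumptions, not the researchers' code: their models work on CAPTCHA images, while this one juggles single numbers. The names `Generator`, `Discriminator` and `train_step`, the learning rate, and the "real" data clustered around 4 are all invented for illustration.

```python
import math
import random


def sigmoid(value):
    """Numerically stable logistic function."""
    if value >= 0:
        return 1.0 / (1.0 + math.exp(-value))
    exp_value = math.exp(value)
    return exp_value / (1.0 + exp_value)


class Generator:
    """Hypothetical forger: turns noise z into a fake sample a * z + b."""

    def __init__(self):
        self.a, self.b = 1.0, 0.0

    def sample(self, z):
        return self.a * z + self.b


class Discriminator:
    """Hypothetical critic: scores how real x looks as sigmoid(w * x + c)."""

    def __init__(self):
        self.w, self.c = 0.1, 0.0

    def score(self, x):
        return sigmoid(self.w * x + self.c)


def train_step(gen, disc, real_x, z, lr=0.01):
    """One round of the contest: the critic learns first, then the forger."""
    fake_x = gen.sample(z)

    # Critic loss: -log D(real) - log(1 - D(fake)); step down its gradient.
    d_real, d_fake = disc.score(real_x), disc.score(fake_x)
    disc.w -= lr * (-(1 - d_real) * real_x + d_fake * fake_x)
    disc.c -= lr * (-(1 - d_real) + d_fake)

    # Forger loss: -log D(fake); the gradient flows back through fake_x.
    d_fake = disc.score(fake_x)
    grad_fake = -(1 - d_fake) * disc.w
    gen.a -= lr * grad_fake * z
    gen.b -= lr * grad_fake


rng = random.Random(0)
generator, discriminator = Generator(), Discriminator()
for _ in range(2000):
    train_step(
        generator,
        discriminator,
        real_x=rng.gauss(4.0, 0.5),  # assumed "real" data, clustered around 4
        z=rng.gauss(0.0, 1.0),       # noise fed to the forger
    )
print("generator offset b:", round(generator.b, 2))
```

The forger's offset `b` starts at 0 and is pushed toward wherever the critic currently rates samples as real. A setup this small may wobble around the target rather than settle on it, which is a well-known habit of adversarial training.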

Can we break it? Yes we GAN!

A GAN turns out to be well-suited to generating training data efficiently. It allowed the researchers to teach their CAPTCHA generation program to quickly create lots of synthetic text puzzles, which were used to train a base puzzle-solving model. They then fine-tuned that model via transfer learning to defeat real text jumbles using only a small set (about 500 instead of millions) of actual samples.
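The pre-train-then-fine-tune recipe is easier to picture with a small example. What follows is a sketch under stated assumptions, not the paper's method: a plain perceptron stands in for the solver's neural network, points on a plane stand in for CAPTCHA images, and `make_samples`, `train` and the boundary values are hypothetical. Only the sample counts nod to the article: plenty of synthetic data, roughly 500 real items.

```python
import random


def make_samples(n, boundary, rng, margin=0.2):
    """Draw n labelled points; the label is 1 when x + y clears the boundary."""
    samples = []
    while len(samples) < n:
        x, y = rng.random(), rng.random()
        if x + y >= boundary + margin:
            samples.append(((x, y), 1))
        elif x + y <= boundary - margin:
            samples.append(((x, y), 0))
        # Points too close to the boundary are skipped so the classes
        # stay cleanly separable.
    return samples


def predict(weights, point):
    w_x, w_y, bias = weights
    return 1 if w_x * point[0] + w_y * point[1] + bias > 0 else 0


def train(samples, weights=(0.0, 0.0, 0.0), max_epochs=10_000):
    """Apply perceptron updates until a full pass makes no mistakes."""
    w_x, w_y, bias = weights
    updates = 0
    for _ in range(max_epochs):
        mistakes = 0
        for point, label in samples:
            error = label - predict((w_x, w_y, bias), point)
            if error:
                w_x += error * point[0]
                w_y += error * point[1]
                bias += error
                mistakes += 1
        updates += mistakes
        if mistakes == 0:
            break
    return (w_x, w_y, bias), updates


def accuracy(weights, samples):
    hits = sum(predict(weights, point) == label for point, label in samples)
    return hits / len(samples)


rng = random.Random(0)
synthetic = make_samples(5000, boundary=1.0, rng=rng)  # cheap generated data
real = make_samples(500, boundary=1.3, rng=rng)        # scarce "real" data

# Step 1: train a base model on the plentiful synthetic set.
base_model, _ = train(synthetic)

# Step 2: fine-tune, starting from the base weights rather than from zero.
tuned_model, tuning_updates = train(real, weights=base_model)

print("base model on real data: ", accuracy(base_model, real))
print("tuned model on real data:", accuracy(tuned_model, real))
print("fine-tuning updates:     ", tuning_updates)
```

Passing `weights=base_model` into the second `train` call is the transfer step: the fine-tuning run inherits what was learned from synthetic data and only has to correct for how the real samples differ.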

Numerous attacks on text-based CAPTCHAs have been devised over the years, the researchers say, but the need to train attack mechanisms to handle specific text-munging techniques has limited how fast attackers can respond to CAPTCHA changes.

"Tuning the attacking heuristics or models requires heavy expert involvement and follows a labor-intensive and time-consuming process of data gathering and labeling," they explain in the paper.

While there have been some generic attacks proposed, they've worked only on relatively simple security features like noisy backgrounds and single fonts.

The researchers contend that by reducing human involvement and the effort to create a targeted CAPTCHA solver, their attack represents "a particular serious threat for text-based CAPTCHAs."

The boffins tested 33 text-based CAPTCHA schemes, of which 11 were being used by 32 of the Alexa-ranked top 50 websites as of April this year. They were able to crack a CAPTCHA in less than 50 milliseconds using a desktop GPU.

Who in their right mind would still be using text-based CAPTCHAs when image-based alternatives are available and Google in October revitalized its reCAPTCHA tech? Quite a few companies, it turns out, among them Baidu, eBay, Google, Microsoft, and Wikipedia.

But with object-identification CAPTCHAs also yielding to machine learning-based attacks, it may be time to look beyond Turing tests. ®
