Stolen passwords integrated into the ultimate dictionary attack
Humans still the weakest link
Targeted password guessing turns out to be significantly easier than it should be, thanks to the online availability of personal information, leaked passwords associated with other accounts, and our tendency to incorporate personal data into our security codes.
In a paper [PDF] presented at the ACM Conference on Computer and Communications Security (CCS) in late October, security researchers from China and the UK describe a system for targeted password guessing and find that a sizable fraction of people's online passwords are vulnerable to attack.
The researchers – Ding Wang, Zijian Zhang and Ping Wang from Peking University, Jeff Yan of Lancaster University, and Xinyi Huang from Fujian Normal University – claim that this threat is significantly underestimated.
Using a targeted password-guessing framework named TarGuess, the researchers achieved success rates as high as 73 per cent with just 100 guesses against typical users, and as high as 32 per cent against security-savvy users.
The researchers used ten large real-world password datasets that have been exposed online, five from English sites, including Yahoo, and five from Chinese sites, including Dodonew.
"Our results suggest that the currently used security mechanisms would be largely ineffective against the targeted online guessing threat, and this threat has already become much more damaging than expected," the researchers state in their paper. "We believe that the new algorithms and knowledge of effectiveness of targeted guessing models can shed light on both existing password practice and future password research."
More or less everyone in the computer security industry and many internet users are aware that passwords offer inadequate security when poorly constructed. As the report notes, between 0.79 per cent and 10.44 per cent of user-chosen passwords, depending on the sample breach data set, can be guessed using the ten most popular passwords, a list that includes perennial favorites "12345" and "password."
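To make that top-ten figure concrete, here is a minimal Python sketch of how such a coverage number could be computed from a list of passwords. The function name `top_k_coverage` and the toy sample are illustrative assumptions, not code or data from the paper.

```python
from collections import Counter


def top_k_coverage(passwords, k=10):
    """Return the fraction of accounts whose password is among the k most common."""
    if not passwords:
        return 0.0
    counts = Counter(passwords)
    # Sum the account counts of the k most frequently used passwords.
    covered = sum(n for _, n in counts.most_common(k))
    return covered / len(passwords)


# Hypothetical toy sample, one entry per account (not data from the paper).
sample = ["12345", "password", "12345", "zebra!42", "password", "12345", "Tq9#vLm2"]
print(f"{top_k_coverage(sample, k=2):.0%} of accounts fall to the top 2 passwords")
```

Run against a real breach corpus with `k=10`, the same calculation would give the kind of percentage the researchers quote for each dataset.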
Low-hanging fruit aside, the researchers note that a small percentage of people use their personal information in their passwords. Between 0.75 per cent and 1.87 per cent of individuals use their full names as their passwords, for instance. Among users in China, where numbers are commonly used in passwords, between 1 per cent and 5.16 per cent use their birthdays as passwords. Email addresses and usernames also get used.
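A site could screen for this habit at sign-up. The following Python sketch shows one way a password checker might flag passwords built from a user's own details. The helper names, the `YYYY-MM-DD` birthday format and the three-character minimum are all assumptions made for illustration; none of this comes from the paper.

```python
import re


def personal_info_tokens(name, birthday, username, email):
    """Collect lowercase fragments of a user's details (birthday as 'YYYY-MM-DD')."""
    tokens = set()
    parts = [p for p in re.split(r"\s+", name.strip().lower()) if p]
    tokens.update(parts)            # given name, family name
    tokens.add("".join(parts))      # full name run together
    year, month, day = birthday.split("-")
    # A few common date layouts; real users pick many more.
    tokens.update({year + month + day, month + day + year,
                   day + month + year, month + day, year})
    tokens.add(username.lower())
    tokens.add(email.split("@")[0].lower())  # local part of the address
    # Ignore very short fragments, which would match almost anything.
    return {t for t in tokens if len(t) >= 3}


def contains_personal_info(password, tokens):
    """Return True if any personal fragment appears inside the password."""
    lowered = password.lower()
    return any(t in lowered for t in tokens)


# Hypothetical user record for illustration only.
tokens = personal_info_tokens("Alice Smith", "1990-04-17", "asmith90", "alice.s@example.com")
print(contains_personal_info("Smith0417!", tokens))
print(contains_personal_info("Tq9#vLm2$wZ", tokens))
```

A substring test like this is deliberately crude: it will miss transformed details (reversed names, leetspeak) that a determined guesser would also try.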
What's more, people often reuse passwords, in whole or in part. And thanks to security breaches that have resulted in the exposure of personal information for hundreds of millions of online accounts, this research shows that it's sometimes possible to use publicly accessible data about an individual, from hacked accounts or otherwise, to gain access to other accounts used by that person.
The researchers' TarGuess algorithms (they made four of them) proved most successful when "sister" passwords – passwords for another account owned by the target – were known. But even when sister passwords were not available, they still achieved success rates ranging from 20 per cent with 100 guesses to 50 per cent with 10^6 (one million) guesses.
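On the defensive side, the sister-password risk suggests rejecting a new password that is only a small tweak of one already known to have leaked. The Python sketch below uses plain Levenshtein edit distance as the similarity measure. That is a stand-in chosen for simplicity, not the model TarGuess uses, and the function names and the threshold of two edits are assumptions.

```python
def edit_distance(a, b):
    """Levenshtein distance: minimum single-character edits to turn a into b."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete from a
                            curr[j - 1] + 1,      # insert into a
                            prev[j - 1] + cost))  # substitute
        prev = curr
    return prev[-1]


def too_close_to_leaked(candidate, leaked_passwords, max_distance=2):
    """Return True if candidate is within max_distance edits of any leaked password."""
    return any(edit_distance(candidate.lower(), old.lower()) <= max_distance
               for old in leaked_passwords)


# Hypothetical leaked password for illustration only.
print(too_close_to_leaked("Summer2017!", ["summer2016"]))
print(too_close_to_leaked("Tq9#vLm2$wZ", ["summer2016"]))
```

Comparing case-insensitively means a capitalised first letter does not count as a change, which reflects how little such tweaks help against a guesser who already holds the old password.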
The researchers achieved higher success rates when more user information was available to them: They were able to guess the passwords of users of Chinese train ticketing site 12306 about 20 per cent of the time when they knew users' email addresses, account names, birthdays, phone numbers, and national identity numbers. The success rate dropped to about 6 per cent when only users' names were known.
"This suggests that the majority of normal users' passwords are prone to a small number of targeted online guesses," the researchers said, noting that this invalidates 2016 NIST guidance that service providers should limit the number of consecutive failed login attempts to 100 each month.
The findings underscore the need for education about how to create strong passwords, and about tools like password managers that allow people to maintain dozens of sufficiently long, complicated codes that have no common patterns. ®