Linguists use sounds to bypass Skype crypto

And you thought grammar was useless…

Decryption is difficult and computationally expensive. So what if, instead of decrypting the content of a message, you found a correlation between the encrypted data and its meaning – without having to crack the code itself?

Such an approach has been demonstrated by a group of University of North Carolina linguists working with computer scientists on encrypted Skype calls. Although the technique described in their research paper recovers conversations only partially, an encryption scheme that leaks even some of the data it’s meant to protect is no longer secure.

It works like this: spoken English has a set of known – and quite settled – rules for its phonetic grammar.

For non-linguists, this means the order in which we can and cannot put different sounds together. The “ds” sound, or phoneme, is fairly common at the end of English words, but doesn’t occur at the beginning.

Systems like speech-to-text converters use these rules to break strings of sounds into individual words; they match sounds against a dictionary of legal phoneme combinations and map these into words. What the researchers discovered is that encryption leaves a pattern that can be subjected to this kind of analysis – without decrypting the data.
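The dictionary-matching step can be sketched in a few lines of Python. This is a toy illustration only: the `LEXICON` entries and the ARPAbet-style phoneme labels are assumptions made up for the example, not anything taken from the research or from a real speech-to-text system.

```python
# Hypothetical pronunciation dictionary: phoneme sequence -> word.
# The entries are invented for illustration.
LEXICON = {
    ("HH", "AE", "N"): "han",
    ("S", "OW", "L", "OW"): "solo",
    ("S", "OW"): "so",
    ("L", "OW"): "low",
}


def segment(phonemes):
    """Return every way to split a phoneme sequence into dictionary words."""
    if not phonemes:
        return [[]]  # one valid parse of nothing: the empty word list
    results = []
    for end in range(1, len(phonemes) + 1):
        word = LEXICON.get(tuple(phonemes[:end]))
        if word is not None:
            # Keep this word and try to parse whatever is left over.
            for rest in segment(phonemes[end:]):
                results.append([word, *rest])
    return results


parses = segment(["HH", "AE", "N", "S", "OW", "L", "OW"])
for parse in parses:
    print(" ".join(parse))
```

With this toy dictionary the same string of sounds has two legal readings, "han so low" and "han solo", which is why a real system also needs some way to rank the candidates.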

When you encode spoken English for VoIP using (in the case of Skype) CELP (code-excited linear prediction), you will end up with patterns in the data that match the patterns in the sounds. In particular, those patterns end up being reflected in the size of the data frame: the more complex the sound that’s being encoded, the larger the frame, resulting in a correlation between frame size and the original sounds spoken.

When the data created by CELP is encrypted, it retains the original frame size – and that means that even encrypted Skype data will retain the correlation between the size of the data frame and the original phonemes.
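To see why the sizes survive, here is a minimal sketch of a length-preserving cipher applied to variable-size frames. Everything in it is an assumption for illustration: the SHA-256 counter keystream is a toy stand-in and not the cipher Skype uses, and the frame sizes are invented.

```python
import hashlib


def keystream(key: bytes, nonce: int, length: int) -> bytes:
    """Toy keystream: SHA-256 over key, nonce and a counter (illustration only)."""
    out = b""
    counter = 0
    while len(out) < length:
        block_input = key + nonce.to_bytes(8, "big") + counter.to_bytes(8, "big")
        out += hashlib.sha256(block_input).digest()
        counter += 1
    return out[:length]


def encrypt_frame(key: bytes, nonce: int, frame: bytes) -> bytes:
    """XOR the frame with the keystream; the output is exactly as long as the input."""
    stream = keystream(key, nonce, len(frame))
    return bytes(a ^ b for a, b in zip(frame, stream))


# Hypothetical variable-bit-rate frames; the sizes are invented.
frames = [bytes(n) for n in (12, 30, 30, 18, 44)]
key = b"demo-key"
encrypted = [encrypt_frame(key, i, f) for i, f in enumerate(frames)]

plain_sizes = [len(f) for f in frames]
cipher_sizes = [len(c) for c in encrypted]
print(plain_sizes)
print(cipher_sizes)
```

The bytes are scrambled, but an eavesdropper who only counts them sees the same sequence of sizes the codec produced.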

The technique gets another helping hand: at least some of the time, boundaries between sounds correspond to sudden changes in frame size, hinting at the difference between “Han Solo” and “Hans Solo”.
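A minimal sketch of that idea, assuming nothing more than a list of observed frame sizes: flag every point where the size jumps by more than some threshold. The sizes and the threshold are invented for the example, and the researchers' actual boundary detection may work quite differently.

```python
def find_boundaries(sizes, threshold):
    """Return each index i where the size differs from frame i-1 by more than threshold."""
    return [
        i
        for i in range(1, len(sizes))
        if abs(sizes[i] - sizes[i - 1]) > threshold
    ]


# Hypothetical encrypted frame sizes in bytes, in arrival order.
sizes = [20, 21, 20, 38, 37, 39, 15, 16]
boundaries = find_boundaries(sizes, threshold=10)
print(boundaries)  # [3, 6]
```

Here the jumps at frames 3 and 6 split the stream into three runs of similar-sized frames, each a candidate sound.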

The researchers mapped the size of encrypted data frames in the Skype stream back to likely patterns of phonemes, and used that mapping – which they called “Phonetic Reconstruction” – to reconstruct the call, without decrypting the data.
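As a rough illustration of mapping sizes back to sounds, the sketch below assigns each run of frames to whichever sound class has the closest average frame size. This is a deliberately crude nearest-average guess, not the method from the paper, and the `PROFILE` values, class names and frame sizes are all invented.

```python
# Hypothetical average frame size (bytes) per sound class; values are invented.
PROFILE = {"vowel": 40.0, "fricative": 28.0, "stop": 16.0}


def classify_segment(segment):
    """Pick the class whose typical frame size is closest to the segment's mean."""
    mean = sum(segment) / len(segment)
    return min(PROFILE, key=lambda label: abs(PROFILE[label] - mean))


def guess_sound_classes(sizes, boundaries):
    """Split sizes at the given boundary indices and classify each piece."""
    edges = [0, *boundaries, len(sizes)]
    return [classify_segment(sizes[a:b]) for a, b in zip(edges, edges[1:])]


# Hypothetical observed frame sizes, with boundaries already located.
sizes = [17, 15, 16, 41, 39, 40, 27, 29]
guesses = guess_sound_classes(sizes, boundaries=[3, 6])
print(guesses)  # ['stop', 'vowel', 'fricative']
```

A sequence of guesses like this is what would then be matched against the rules of which sounds can follow which, narrowing the candidates down to plausible words.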

So how well does it work? Not so well that we should all abandon Skype tomorrow. However, the researchers noted that if an encryption scheme is to be considered secure, “no reconstruction, even a partial one, should be possible; indeed, any cryptographic system that leaked as much information as shown here would immediately be deemed insecure.”

Bigger phoneme-word dictionaries (covering more dialects and languages) and faster processing would improve the accuracy of this kind of analysis. ®
