Data-mining technique outs authors of anonymous email

Unmasking trolls, one 'write-print' at a time

Engineers and computer scientists say they have devised a novel method for identifying the authors of anonymous emails, one they say is reliable enough to be used in courts of law.

In a series of papers published over the past few years, the researchers from Concordia University in Montreal have described what they say is the first-ever data-mining algorithm for identifying the most plausible author of an anonymous email. It works by establishing a “write-print” for each suspected author, quantifying the patterns that are unique to that individual's email writing. It can be used to unmask the authors of emails used in spam, phishing, cyberbullying and other types of offenses.

“Our insight is that the write-print of an individual is the combinations of features that occur frequently in his/her written emails,” the researchers wrote in a paper (PDF) first published in the journal Digital Investigation. “The commonly used features are lexical, syntactical, structural and content-specific attributes. By matching the write-print with the malicious email, the true author can be identified.”

Characteristics include word usage, word sequence, common spelling and grammatical mistakes, vocabulary richness, hyphenation and punctuation.
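As a rough illustration of how a few of these characteristics might be turned into numbers, here is a minimal Python sketch. The function name stylometric_features and the particular ratios it computes are assumptions chosen for illustration; they are not taken from the researchers' papers, which describe a far richer set of lexical, syntactical, structural and content-specific features.

```python
import re
import string


def stylometric_features(text):
    """Return a few simple style measurements for one email body.

    Hypothetical helper for illustration only; not the researchers' feature set.
    """
    # Words are runs of letters/apostrophes, optionally joined by hyphens.
    words = re.findall(r"[a-z']+(?:-[a-z']+)*", text.lower())
    n_words = len(words)
    n_chars = len(text)
    return {
        # Share of distinct words: a crude proxy for vocabulary richness.
        "vocabulary_richness": len(set(words)) / n_words if n_words else 0.0,
        # Share of words that contain a hyphen.
        "hyphenated_ratio": sum("-" in w for w in words) / n_words if n_words else 0.0,
        # Share of characters that are ASCII punctuation.
        "punctuation_ratio": sum(c in string.punctuation for c in text) / n_chars if n_chars else 0.0,
    }


features = stylometric_features("Well-known facts are well-known.")
print(features)
```

Measurements like these would still have to be discretized and combined before they could serve as the "features" the researchers refer to.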

The new approach differs from previous methods by filtering out characteristics found in two or more of the suspects' writing styles. So-called decision tree methods often attempt to use the same set of features to deduce the write-print of different suspects. By excluding the styles that multiple suspects share, the technique attempts to generate a unique signature for each potential author under investigation.

At the heart of the method is an algorithm known as AuthorMiner. It mathematically extracts frequent patterns found in each suspect's emails and then filters out those that are common to other suspects. It then compares the anonymous email with each of the generated write-prints to identify the closest match.
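The three steps just described (mine frequent patterns, drop the ones suspects share, match the anonymous email) can be sketched in a few lines of Python. This is a toy under stated assumptions, not the AuthorMiner implementation: the "features" here are simply lowercased words, patterns are limited to one or two features, and the names items, frequent_patterns, write_prints and most_plausible_author, along with the min_support threshold and the support-sum score, are all hypothetical.

```python
from itertools import combinations


def items(email):
    """Hypothetical feature items: the set of lowercased words in the email."""
    return frozenset(email.lower().split())


def frequent_patterns(emails, min_support=0.5, max_size=2):
    """Map each feature combination (up to max_size items) to its support,
    keeping only those found in at least min_support of the emails."""
    counts = {}
    for email in emails:
        feats = sorted(items(email))
        for size in range(1, max_size + 1):
            for combo in combinations(feats, size):
                key = frozenset(combo)
                counts[key] = counts.get(key, 0) + 1
    n = len(emails)
    return {p: c / n for p, c in counts.items() if c / n >= min_support}


def write_prints(emails_by_suspect, min_support=0.5):
    """Build one write-print per suspect: that suspect's frequent patterns
    minus any pattern that is also frequent for another suspect."""
    frequent = {s: frequent_patterns(e, min_support)
                for s, e in emails_by_suspect.items()}
    prints = {}
    for suspect, patterns in frequent.items():
        shared = set()
        for other, other_patterns in frequent.items():
            if other != suspect:
                shared.update(other_patterns)
        prints[suspect] = {p: sup for p, sup in patterns.items()
                           if p not in shared}
    return prints


def most_plausible_author(anonymous_email, prints):
    """Score each write-print by the summed support of its patterns that
    appear in the anonymous email, and return the best-scoring suspect."""
    feats = items(anonymous_email)
    scores = {s: sum(sup for p, sup in wp.items() if p <= feats)
              for s, wp in prints.items()}
    return max(scores, key=scores.get), scores


# Made-up sample data, for illustration only.
emails_by_suspect = {
    "alice": ["kindly revert asap thanks",
              "kindly revert by monday thanks",
              "please kindly revert thanks"],
    "bob": ["cheers mate talk soon thanks",
            "cheers mate see you thanks",
            "cheers mate thanks"],
}
prints = write_prints(emails_by_suspect)
author, scores = most_plausible_author(
    "kindly revert with the numbers thanks", prints)
print(author, scores)
```

In this made-up data both suspects sign off with "thanks", so that word is filtered out of both write-prints, and only the habits peculiar to each writer contribute to the match.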

To test the method, the researchers ran it on a set of more than 200,000 emails written by 158 employees of Enron before the energy company was exposed for financial fraud. When finely tuned, the technique identified the author about 80 percent of the time.

Additional papers from the researchers – who include Farkhund Iqbal, Rachid Hadjidj, Benjamin Fung, and Mourad Debbabi – are available here. ®
