Open source app can detect text's authors

Bible, US Constitution analysed. Next: your kids' school work and your email

A group of Adelaide researchers has released an open-source tool that helps identify document authorship by comparing texts.

While their own test cases – and therefore the headlines – concentrated on identifying the authors of historical documents, it seems to The Register that any number of modern uses of such a tool might arise.

The two test cases the researchers drew on in developing their software, which is available on GitHub, were a series of US essays called The Federalist Papers, and the Letter to the Hebrews in the New Testament.

The Federalist Papers were written by Alexander Hamilton, James Madison and John Jay in the lead-up to the ratification of the US Constitution. Of the 85 essays, the authorship of 12 is disputed and one has generally been attributed to Jay.

Professor Derek Abbott of the University of Adelaide explains the results: “We’ve shown that one of the disputed texts, Essay 62, is indeed written by James Madison with a high degree of certainty.

“But the other 12 essays cannot be allocated to any of the three authors with a similarly strong likelihood. We believe they are probably the result of a certain degree of collaboration between the authors, which would also explain why there hasn’t been scholarly consensus to date.”

As for the Letter to the Hebrews, the analysis suggests it should be attributed to the Apostle Paul, but there’s enough evidence of someone else’s hand that the result could be a false positive, or could reflect the personality of a translator as well as the author.

In the research paper, published in full in PLOS ONE, the group notes that author attribution is a question that’s stretching beyond academia in the modern era.

“Due to an increase in the amount of data in various forms including emails, blogs, messages on the internet and SMS, the problem of author attribution has received more attention. In addition to its traditional application for shedding light on the authorship of disputed texts in the classical literature, new applications have arisen such as plagiarism detection, web searching, spam email detection, and finding the authors of disputed or anonymous documents in forensics against cyber crime,” the researchers write.

They note that further research would be needed to test their methodology against modern texts, but with the software offered for free, The Register can easily imagine it getting a workout from any number of interested parties. ®
