The Register® — Biting the hand that feeds IT

Open source app can detect text's authors

Bible, US Constitution analysed. Next: your kids' school work and your email

A group of Adelaide researchers has released an open-source tool that helps identify document authorship by comparing texts.

While their own test cases – and therefore the headlines – concentrated on identifying the authors of historical documents, it seems to The Register that any number of modern uses of such a tool might arise.

The two test cases the researchers drew on in developing their software, which is available on GitHub, were a series of US essays called The Federalist Papers and the Letter to the Hebrews in the New Testament.

The Federalist Papers were written by Alexander Hamilton, James Madison and John Jay to promote ratification of the US Constitution. Of the 85 essays, the authorship of 12 is disputed and one has generally been attributed to Jay.

Professor Derek Abbott of the University of Adelaide explains the results: “We’ve shown that one of the disputed texts, Essay 62, is indeed written by James Madison with a high degree of certainty.

“But the other 12 essays cannot be allocated to any of the three authors with a similarly strong likelihood. We believe they are probably the result of a certain degree of collaboration between the authors, which would also explain why there hasn’t been scholarly consensus to date.”

As for the Letter to the Hebrews, the analysis suggests it should be attributed to the Apostle Paul, but there’s enough evidence of someone else’s hand that the result could either be a false positive or reflect the personality of a translator as well as the author.

In the research paper, published in full in PLOS ONE, the group notes that author attribution is a question that’s stretching beyond academia in the modern era.

“Due to an increase in the amount of data in various forms including emails, blogs, messages on the internet and SMS, the problem of author attribution has received more attention. In addition to its traditional application for shedding light on the authorship of disputed texts in the classical literature, new applications have arisen such as plagiarism detection, web searching, spam email detection, and finding the authors of disputed or anonymous documents in forensics against cyber crime,” the researchers write.

They note that further research would be needed to test their methodology against modern texts, but with the software offered for free, The Register can easily imagine it getting a workout from any number of interested parties. ®

Re: Turnitin

Isn't the point that Turnitin identifies when texts are the same (it identifies when text A is pretty much the same as text B, probably because it's been ripped off) but that this software identifies the author of two different texts? In other words, it'll tell you if the same person (probably) wrote Twelfth Night and As You Like It, but not whether As You Like It by T Mangrove is the same as As You Like It by W Shakespeare.

Does anyone have any experience of working with Turnitin etc? Does it work?


Re: Turnitin

In my experience, Turnitin is a steaming pile of horse manure that spews false positives.

Apart from describing my work as plagiarism for using the same page numbering template in the header of a Word document as a student from Manchester University, it enjoys highlighting my use of three common words together as a form of intellectual theft.

Anyway, I believe they are different technologies. Turnitin checks for roughly the same content, structure, sentences etc. to establish originality whereas this project assesses the style of a text to see if it matches samples of an author's known texts to establish authorship.
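To make that distinction concrete, here is a rough Python sketch. It is not the Adelaide group's method and not how Turnitin works internally. The function-word list, the cosine measure, the trigram overlap score and every function name below are assumptions chosen purely for illustration. One function scores shared wording (the plagiarism-style check), the other compares how often common function words turn up (a crude style check).

```python
import math
import re
from collections import Counter

# Hypothetical, deliberately tiny function-word list (an assumption,
# not taken from the researchers' software).
FUNCTION_WORDS = ["the", "of", "and", "to", "a", "in", "that", "is", "it",
                  "by", "upon", "on", "with", "as", "for", "but", "not", "which"]


def tokenize(text):
    """Lower-case the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def trigram_overlap(a, b):
    """Content check: Jaccard overlap of the word trigrams in two texts."""
    def grams(text):
        words = tokenize(text)
        return {tuple(words[i:i + 3]) for i in range(len(words) - 2)}
    ga, gb = grams(a), grams(b)
    union = ga | gb
    return len(ga & gb) / len(union) if union else 0.0


def style_profile(text):
    """Style check input: relative frequency of each function word."""
    words = tokenize(text)
    counts = Counter(words)
    total = len(words) or 1
    return [counts[w] / total for w in FUNCTION_WORDS]


def cosine(u, v):
    """Cosine similarity between two equal-length frequency vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def closest_author(disputed, known):
    """Return the author whose known sample has the most similar profile."""
    target = style_profile(disputed)
    return max(known, key=lambda name: cosine(target, style_profile(known[name])))


# Toy samples, invented for this example only.
known = {
    "A": "the cat sat upon the mat and the dog sat upon the log",
    "B": "it is not that which is for a day but that which is for a year",
}
disputed = "the bird sat upon the branch and the fox sat upon the rock"

print(trigram_overlap(disputed, known["A"]))  # low: little shared wording
print(closest_author(disputed, known))        # picks the closer style profile
```

In this toy case the disputed text shares almost no wording with sample A, yet its function-word habits match A's, which is the gap between the two kinds of tool. Real attribution work needs far longer texts and far more careful statistics than this.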


Does it detect the Quran as being the word of God? (Does it think the Bible is by the same hand?)
