Reddit programmer charged with massive data theft

Harvard ethics fellow accused of hacking MIT

A former employee of Reddit has been accused of hacking into the computer systems of the Massachusetts Institute of Technology and downloading almost 5 million scholarly documents from a nonprofit archive service.

Aaron Swartz, a 24-year-old researcher in Harvard University's Center for Ethics, broke into a locked computer-wiring closet in an MIT basement and used a switch there to gain unauthorized access to the university's network, federal prosecutors alleged Tuesday. He then downloaded 4.8 million articles from JSTOR, an online archive of more than 1,000 academic journals, according to an indictment filed in US District Court in Boston.

“As JSTOR, and then MIT, became aware of these efforts to steal a vast proportion of JSTOR's archive, each took steps to block the flow of articles to Swartz's computer and thus to prevent him from redistributing them,” the court document stated. “Swartz, in turn, repeatedly altered the appearance of his Acer laptop and the apparent source of his automated demands to get around JSTOR's and MIT's blocks against his computer.”

Attempts to reach Swartz for comment weren't immediately successful. According to his personal website, he is a cofounder of the social news website Reddit, although many people dispute this characterization. He is also the author of numerous articles on a variety of topics including “the corrupting influence of big money on institutions.”

Members of Demand Progress, a nonprofit political action group Swartz founded, criticized the indictment.

“This makes no sense,” the group's executive director, David Segal, said in a statement. “It’s like trying to put someone in jail for allegedly checking too many books out of the library.”

The indictment gives examples of that back-and-forth. When JSTOR in September blocked the MIT IP address Swartz was using, the Harvard fellow allegedly incremented a single digit of the address and resumed his wholesale downloading binge, which was streamlined with a custom Python script. JSTOR at times responded by blocking huge ranges of IP addresses, causing legitimate JSTOR users at MIT to be denied access.
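
The trade-off described above, where a single-address block is trivially sidestepped while a range block also catches bystanders, can be sketched with Python's standard ipaddress module. This is only an illustration: the addresses come from the reserved 192.0.2.0/24 documentation range, and the function names are made up for the example. Neither the real addresses nor JSTOR's actual blocking mechanism are described in the source.

```python
import ipaddress

# Hypothetical addresses from the 192.0.2.0/24 documentation range;
# the addresses actually involved are not given in the article.
blocked_single = {ipaddress.ip_address("192.0.2.10")}
blocked_range = ipaddress.ip_network("192.0.2.0/24")


def allowed_by_single_block(addr: str) -> bool:
    """Return True if addr is not on the one-address block list."""
    return ipaddress.ip_address(addr) not in blocked_single


def allowed_by_range_block(addr: str) -> bool:
    """Return True if addr falls outside the blocked network range."""
    return ipaddress.ip_address(addr) not in blocked_range


neighbour = "192.0.2.11"   # one digit away from the blocked address
bystander = "192.0.2.200"  # an unrelated user on the same network

print(allowed_by_single_block(neighbour))  # True: the narrow block misses it
print(allowed_by_range_block(neighbour))   # False: the range block catches it
print(allowed_by_range_block(bystander))   # False: so is the legitimate user
```

The narrow block leaves the neighbouring address untouched, while the range block shuts out every address on the network, including users who had nothing to do with the downloads.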

Eventually, MIT responded by blocking the MAC address of his Acer laptop, so Swartz allegedly spoofed the digital serial number, again by incrementing a single character of the address.

According to authorities, Swartz hid the laptop and a battery of external hard drives under a box in the wiring closet so they wouldn't be noticed by anyone who entered the enclosure. He then periodically swapped out the drives, taking pains to mask his face with a bicycle helmet to evade identification.

Of the 4.8 million documents allegedly downloaded, about 1.7 million of them were made available for purchase by independent publishers. Prosecutors said Swartz planned to dump the huge stash on one or more file-sharing sites.

Swartz was charged with computer intrusion, fraud, and data theft. If convicted, he faces a maximum of 35 years in prison, restitution and forfeiture, and a fine of $1 million. A PDF of the indictment is here. ®

This post was updated to clarify Swartz's position at Reddit and to add comments from Demand Progress.
