Think file-hosting sites guard your private data? Think again

Attacks already under way

Academic researchers say they've uncovered weaknesses in dozens of the most popular file hosting sites that allow people to gain unauthorized access to data that's supposed to be available only to those selected by the user.

The services, which include sites such as RapidShare, FileFactory, and Easyshare, allow users to upload large files and make them available to anyone who knows the unique URI (or Uniform Resource Identifier) that's bound to each one. Users may post the link on websites or forums available to the public or share it in a single email to prevent all but the recipient from downloading it. RapidShare, for instance, says it can be used to “share your data with your friends, colleagues or family.”

But according to academics in Belgium and France, a “significant percentage” of the 100 FHSs (or file hosting services) they studied made it trivial for outsiders to access the files simply by guessing the URIs bound to each uploaded file. What's more, they presented evidence that such attacks, far from being theoretical, are already happening in the wild.

“These services adopt a security-through-obscurity mechanism where a user can access the uploaded files only by knowing the correct download URIs,” the researchers wrote in a paper presented at the most recent USENIX Workshop on Large-Scale Exploits and Emergent Threats. “While these services claim that these URIs are secret and cannot be guessed, our study shows that this is far from being true.”

The researchers said they trained web crawlers on the file services and uncovered hundreds of thousands of private files in less than a month. They also used the sites to store private files that contained internet beacons, so they'd know if anyone opened them. Over a month's span, 80 unique IP addresses accessed the so-called honey files 275 times, indicating that the weakness is already being exploited in the wild to harvest data many users believe isn't available for general consumption.

The weakness that's easiest to exploit was found on sites that use sequential identifiers in the download URIs. Using scripts that enumerate the IDs character by character, the researchers' crawler located almost 311,000 unique files over a period of 30 days. The researchers then ran searches on Microsoft's Bing.com and estimated that 168,320 of those files, or 54 percent, were private because their links hadn't been shared online.
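To see why sequential identifiers are so easy to walk, consider a minimal sketch in Python. It assumes a hypothetical service that encodes an upload counter as a fixed-width base-36 string; the alphabet, the width, and every function name here are illustrative and are not taken from the paper. With such a scheme, knowing one ID is enough to compute its neighbour, whereas an ID drawn from a cryptographic random source gives nothing away.

```python
import secrets
import string

# Assumed ID alphabet (0-9 then a-z); real services may use a different one.
ALPHABET = string.digits + string.ascii_lowercase


def encode(n: int, width: int = 6) -> str:
    """Encode a counter value as a fixed-width base-36 ID."""
    chars = []
    for _ in range(width):
        n, remainder = divmod(n, len(ALPHABET))
        chars.append(ALPHABET[remainder])
    return "".join(reversed(chars))


def decode(file_id: str) -> int:
    """Recover the counter value behind a base-36 ID."""
    n = 0
    for ch in file_id:
        n = n * len(ALPHABET) + ALPHABET.index(ch)
    return n


def next_sequential_id(file_id: str) -> str:
    """With a counter-based scheme, one known ID reveals the next one."""
    return encode(decode(file_id) + 1, len(file_id))


def random_id(num_bytes: int = 16) -> str:
    """An ID drawn from a CSPRNG; neighbouring uploads are unrelated."""
    return secrets.token_urlsafe(num_bytes)


print(next_sequential_id("00a1z9"))  # the ID that follows a known one
print(random_id())                   # 128 bits of randomness, URL-safe
```

The point of the contrast is that unguessability has to come from the generator, not from the URI looking opaque to a human reader.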

“Unfortunately, the problem is extremely serious since the list of insecure FHSs using sequential IDs also includes some of the most popular names, often highly ranked by Alexa in the list of the top internet websites,” the researchers wrote. To prevent their findings from being abused, their report didn't say which sites are vulnerable to specific types of attacks.

Another common weakness involved the use of pseudorandom URIs for each uploaded file. By using brute-force attacks that cycled through every possible combination, the researchers were able to successfully guess a file's unique ID 1.1 times for every thousand attempts. Part of the weakness is the result of websites that used IDs consisting only of numeric strings with a maximum length of six digits. But even when services used alphanumeric IDs or numeric IDs with a length of eight, the researchers achieved similar success rates.
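The arithmetic behind those success rates can be sketched in a few lines of Python. This is a rough model, not the researchers' method: it assumes IDs are spread uniformly over the whole ID space, and the function names and the figure of 1,100 live files are hypothetical, chosen only so the six-digit case reproduces the 1.1-per-thousand rate quoted above.

```python
def keyspace(alphabet_size: int, length: int) -> int:
    """Number of distinct IDs of exactly `length` characters."""
    return alphabet_size ** length


def expected_hit_rate(live_files: int, alphabet_size: int, length: int) -> float:
    """Chance that one uniformly random guess lands on a live file."""
    return live_files / keyspace(alphabet_size, length)


# Six-digit numeric IDs: only a million possibilities.
print(keyspace(10, 6))                    # 1000000

# Eight alphanumeric characters (a-z, A-Z, 0-9) under the same model.
print(keyspace(62, 8))                    # 218340105584896

# Hypothetical: 1,100 live files behind six-digit IDs.
print(expected_hit_rate(1100, 10, 6))
```

Under this model a longer, richer alphabet should push the hit rate toward zero. That it reportedly did not suggests the IDs on those services were far less random than their length implies.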

In other cases, file services used ID systems complex enough to render brute-force techniques ineffective, or used CAPTCHAs or other mitigations. But the researchers were often able to guess the names anyway, in some cases by exploiting a directory traversal vulnerability in a webhosting program used by multiple services.

In other cases, they defeated the mitigations by using a feature that allows people to report copyright violations and other abuse to the site admins and combining it with a separate feature for deleting files. Because the feature on one site exposed the first 10 characters of a file's 14-character ID, only four characters remained unknown, leaving a manageable 65,536 combinations to brute force (a figure consistent with four hexadecimal characters, since 16 to the fourth power is 65,536).
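The effect of leaking part of an ID can be checked with a short calculation. The sketch below assumes a 16-symbol (hexadecimal) alphabet, which the article does not state but which matches the 65,536 figure; the function name is illustrative.

```python
def remaining_combinations(alphabet_size: int, id_length: int, known_prefix: int) -> int:
    """Search space left once the first `known_prefix` characters are known."""
    return alphabet_size ** (id_length - known_prefix)


HEX = 16  # assumed alphabet size, not confirmed by the source

full_space = remaining_combinations(HEX, 14, 0)
after_leak = remaining_combinations(HEX, 14, 10)

print(full_space)   # 72057594037927936
print(after_leak)   # 65536
```

In other words, an ID space that would be impractical to search in full shrinks to a few tens of thousands of requests once 10 of the 14 characters are exposed.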

The researchers said the most effective countermeasure against the attacks is the use of encryption on the user's computer. They developed a proof-of-concept Firefox add-on that automatically encrypts and decrypts files upon upload and download and uses steganographic techniques to hide the encrypted files.

The researchers included Nick Nikiforakis, Steven Van Acker, and Wouter Joosen of the Katholieke Universiteit Leuven in Belgium, and Marco Balduzzi and Davide Balzarotti of the Institute Eurecom in France. A PDF of their paper is here. ®
