US law firm cleared of robots.txt DMCA hacking charge

Wayback Machine just screwed up, court says

Analysis: Sometimes plaintiffs just don't know when to quit.

After losing a trademark infringement suit against a competitor, Healthcare Advocates - a patient advocacy organization based out of Philadelphia - sued the intellectual property law firm that represented the defendant in the trademark action, alleging that the firm had "hacked" the Wayback Machine in order to view blocked archives of its website.

The firm - Harding, Earley, Follmer & Frailey - used the Wayback Machine to look at past incarnations of Healthcare Advocates' site in order to gather evidence to defend against the original trademark infringement charges. Healthcare Advocates had a robots.txt file in place to prevent anyone from viewing the archived versions of its site, but the law firm was still able to bring up certain archived pages.

Healthcare Advocates argued that this constituted a circumvention of a technical measure designed to control access to a copyrighted work, which would violate the Digital Millennium Copyright Act. The company alleged that the firm used the Wayback Machine to bypass its technical measure, the robots.txt file, in order to view its copyrighted website.

The US District Court for the Eastern District of Pennsylvania wasn't buying it, however. The court last week pointed out that the law firm didn't do anything out of the ordinary in order to gain access to the archived pages that Healthcare Advocates had intended to block. Instead, the Wayback Machine simply malfunctioned and allowed the firm to view material that should have been blocked.

Normally, when the Wayback Machine receives a request for the archives of a site with a robots.txt file, it displays a blocked site error message. If any of the site's pages are not blocked, the error message will contain a link to those past versions.
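As a rough illustration of that per-page check, here is a minimal Python sketch built on the standard library's urllib.robotparser. The robots.txt contents, the example.com URLs, the "ia_archiver" user-agent string and the helper names are all assumptions made for illustration: the article does not reproduce Healthcare Advocates' actual file, and this is not the Internet Archive's code.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt files; the real one is not quoted in the article.
BLOCK_EVERYTHING = "User-agent: *\nDisallow: /\n"
BLOCK_ONE_FOLDER = "User-agent: *\nDisallow: /private/\n"


def is_blocked(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt text disallows this URL."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return not parser.can_fetch(user_agent, url)


def viewable_pages(robots_txt: str, user_agent: str, urls: list) -> list:
    """Return only the archived URLs that robots.txt does not block."""
    return [u for u in urls if not is_blocked(robots_txt, user_agent, u)]


if __name__ == "__main__":
    archived = [
        "http://example.com/index.html",
        "http://example.com/private/notes.html",
    ]
    # Assumed user-agent name for the archive's crawler.
    print(viewable_pages(BLOCK_EVERYTHING, "ia_archiver", archived))
    print(viewable_pages(BLOCK_ONE_FOLDER, "ia_archiver", archived))
```

With a blanket "Disallow: /" rule, no archived page should be offered at all, which is why a working link in the error message was unexpected here.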

Such a link came up when the law firm searched for the Healthcare Advocates site, even though the robots.txt file should have blocked all of the site's pages.

Apparently, a caching error brought on by heavy server load on the days in question led certain Internet Archive servers to "forget" that they had a copy of the Healthcare Advocates robots.txt file. Then, for unknown reasons, the servers overlooked the robots.txt file when querying Healthcare Advocates' website directly.
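The failure described here, a cached robots.txt being lost and the fallback lookup then coming back empty, can be sketched in a few lines of Python. This is a hypothetical model, not the Internet Archive's implementation: the RobotsCache class, its fetch callback and the fail_closed flag are invented for illustration. It only shows how a "fail open" policy lets pages through when the rules go missing, while a "fail closed" policy keeps them blocked.

```python
from typing import Callable, Dict, Optional
from urllib.robotparser import RobotFileParser


class RobotsCache:
    """Hypothetical per-host cache of robots.txt text."""

    def __init__(self, fetch: Callable[[str], Optional[str]],
                 fail_closed: bool = True) -> None:
        self._fetch = fetch              # returns robots.txt text, or None on failure
        self._fail_closed = fail_closed  # block everything when rules are unknown
        self._cache: Dict[str, str] = {}

    def evict(self, host: str) -> None:
        """Simulate the cache 'forgetting' a host's robots.txt."""
        self._cache.pop(host, None)

    def may_serve(self, host: str, path: str, user_agent: str = "*") -> bool:
        text = self._cache.get(host)
        if text is None:
            text = self._fetch(host)     # fall back to asking the live site
            if text is None:
                # Rules are unknown: the policy decides what happens next.
                return not self._fail_closed
            self._cache[host] = text
        parser = RobotFileParser()
        parser.parse(text.splitlines())
        return parser.can_fetch(user_agent, path)
```

In this model, a fail-open server that loses its cached copy and then gets nothing back from the live lookup would serve pages as though no robots.txt existed, which matches what the law firm saw in its browser.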

This allowed the law firm to view some pages, but not others. Healthcare Advocates never asserted that the law firm had anything to do with causing the excessive load on the days when it tried to view the archived pages, and it was undisputed that the law firm used nothing more than an ordinary web browser to use the Wayback Machine.

The court held that, since the technical prevention measure never actually stood between the law firm and the copyrighted material, the law firm couldn't have circumvented it. Or, to use the court's phrase, "[t]hey did not 'pick the lock' . . . because there was no lock to pick."

The law firm never bypassed the Wayback Machine's obstructions in order to view the pages. The archives simply showed up as if there had been no robots.txt file in place at all. With those facts, the judge concluded, there had been no "hack," and the law firm had not violated the DMCA.

Healthcare Advocates' real beef should have been with Internet Archive for allowing the pages to slip through, but the San Francisco organization settled its way out of the lawsuit last year. The terms of the agreement are unknown, but it allowed IA to keep a judgment against it from showing up in the public record.

This is fairly important when considered in the context of the judge's conclusion that a robots.txt file, under the facts of this case, actually does constitute a technical measure subject to the DMCA. That view, if adopted in other jurisdictions, could have some widespread implications for Internet programmers.

While the facts of the current case limit the reach of the robots.txt ruling, it does open the door for a more expansive view of robots.txt files in the future. There may come a day when anyone writing code that ignores a robots.txt could be on the hook for violating the DMCA.

Thus, it was a very good thing for Internet Archive that it got out of the suit when it did. If it had remained a defendant, the facts before the court would have been much broader, and the judge would have had an opportunity to rule on whether a program that overlooked a robots.txt file violated the DMCA.

An adverse ruling on that issue could have caused some serious problems for Internet Archive, as well as for some big-name deep-pockets out there (*cough* GOOGLE! *cough*). It would have undoubtedly created a whole new class of lawsuit against web services that missed or ignored a robots.txt as they scoured the Internet.

Which, ironically, would have been good news for the law firm defendant in this case.

They won the battle, but they may have lost a huge new revenue stream. ®

Kevin Fayle is an attorney, web editor and writer in San Francisco.
