Feeds

US law firm cleared of robots.txt DMCA hacking charge

Wayback Machine just screwed up, court says

The essential guide to IT transformation

Analysis Sometimes plaintiffs just don't know when to quit.

After losing a trademark infringement suit against a competitor, Healthcare Advocates - a patient advocacy organization based out of Philadelphia - sued the intellectual property law firm that represented the defendant in the trademark action, alleging that the firm had "hacked" the Wayback Machine in order to view blocked archives of its website.

The firm - Harding, Earley, Follmer & Frailey - used the Wayback Machine to look at past incarnations of Healthcare Advocates' site in order to gather evidence to defend against the original trademark infringement charges. Healthcare Advocates had a robots.txt file in place to prevent anyone from viewing the archived versions of its site, but the law firm was still able to bring up certain archived pages.

Healthcare Advocates argued that this constituted a circumvention of a technical measure designed to control access to a copyrighted work, which would violate the Digital Millenium Copyright Act. The company alleged that the firm used the Wayback Machine to bypass its technical measure, the robots.txt file, in order to view its copyrighted website.

The US District Court for the Eastern District of Pennsylvania wasn't buying it, however. The court last week pointed out that the law firm didn't do anything out of the ordinary in order to gain access to the archived pages that Healthcare Advocates had intended to block. Instead, the Wayback Machine simply malfunctioned and allowed the firm to view material that should have been blocked.

Normally, when the Wayback Machine receives a request for the archives of a site with a robots.txt file, it displays a blocked site error message. If any of the site's pages are not blocked, the error message will contain a link to those past versions.

Such a link came up when the law firm searched for the Healthcare Advocates site, even though the robots.txt file should have blocked all of the site's pages.

Apparently, a caching error caused by heavy server load on the days in question caused certain Internet Archive servers to "forget" that they had a copy of the Healthcare Advocates robots.txt file. Then, for unknown reasons, the servers overlooked the robots.txt file when querying Healthcare Advocates' website directly.

This allowed the law firm to view some pages, but not others. Healthcare Advocates never asserted that the law firm had anything to do with causing the excessive load on the days when it tried to view the archived pages, and it was undisputed that the law firm used nothing more than an ordinary web browser to use the Wayback Machine.

The court held that, since the technical prevention measure never actually stood between the law firm and the copyrighted material, the law firm couldn't have circumvented it. Or, to use the court's phrase, "[t]hey did not 'pick the lock' . . . because there was no lock to pick."

The law firm never bypassed the Wayback Machine's obstructions in order to view the pages. The archives simply showed up as if there had been no robots.txt file in place at all. With those facts, the judge concluded, there had been no "hack," and the law firm had not violated the DMCA.

Healthcare Advocates' real beef should have been with Internet Archive for allowing the pages to slip through, but the San Francisco organization settled their way out of the lawsuit last year. The terms of the agreement are unknown, but it allowed IA to avoid having a judgment against it from showing up in the public record.

This is fairly important when considered in the context of the judge's conclusion that a robots.txt file, under the facts of this case, actually does constitute a technical measure subject to the DMCA. That view, if adopted in other jurisdictions, could have some widespread implications for Internet programmers.

While the facts of the current case limit the reach of the robots.txt ruling, it does open the door for a more expansive view of robots.txt files in the future. There may come a day when anyone writing code that ignores a robots.txt could be on the hook for violating the DMCA.

Thus, it was a very good thing for Internet Archive that it got out of the suit when it did. If they had remained as a defendant, the facts of the case would have been much more expansive, and the judge would have had an opportunity to rule on the issue of whether a program that overlooked a robots.txt file violated the DMCA.

An adverse ruling on that issue could have caused some serious problems for Internet Archive, as well as for some big-name deep-pockets out there (*cough* GOOGLE! *cough*). It would have undoubtedly created a whole new class of lawsuit against web services that missed or ignored a robots.txt as they scoured the Internet.

Which, ironically, would have been good news for the law firm defendant in this case.

They won the battle, but they may have lost a huge new revenue stream. ®

Kevin Fayle is an attorney, web editor and writer in San Francisco.

The essential guide to IT transformation

More from The Register

next story
Britain's housing crisis: What are we going to do about it?
Rent control: Better than bombs at destroying housing
Top beak: UK privacy law may be reconsidered because of social media
Rise of Twitter etc creates 'enormous challenges'
GCHQ protesters stick it to British spooks ... by drinking urine
Activists told NOT to snap pics of staff at the concrete doughnut
What do you mean, I have to POST a PHYSICAL CHEQUE to get my gun licence?
Stop bitching about firearms fees - we need computerisation
Ex US cybersecurity czar guilty in child sex abuse website case
Health and Human Services IT security chief headed online to share vile images
We need less U.S. in our WWW – Euro digital chief Steelie Neelie
EC moves to shift status quo at Internet Governance Forum
Oz biz regulator discovers shared servers in EPIC FACEPALM
'Not aware' that one IP can hold more than one Website
prev story

Whitepapers

Endpoint data privacy in the cloud is easier than you think
Innovations in encryption and storage resolve issues of data privacy and key requirements for companies to look for in a solution.
Implementing global e-invoicing with guaranteed legal certainty
Explaining the role local tax compliance plays in successful supply chain management and e-business and how leading global brands are addressing this.
Advanced data protection for your virtualized environments
Find a natural fit for optimizing protection for the often resource-constrained data protection process found in virtual environments.
Boost IT visibility and business value
How building a great service catalog relieves pressure points and demonstrates the value of IT service management.
Next gen security for virtualised datacentres
Legacy security solutions are inefficient due to the architectural differences between physical and virtual environments.