US law firm cleared of robots.txt DMCA hacking charge

Wayback Machine just screwed up, court says

Analysis Sometimes plaintiffs just don't know when to quit.

After losing a trademark infringement suit against a competitor, Healthcare Advocates - a patient advocacy organization based in Philadelphia - sued the intellectual property law firm that represented the defendant in the trademark action, alleging that the firm had "hacked" the Wayback Machine in order to view blocked archives of its website.

The firm - Harding, Earley, Follmer & Frailey - used the Wayback Machine to look at past incarnations of Healthcare Advocates' site in order to gather evidence to defend against the original trademark infringement charges. Healthcare Advocates had a robots.txt file in place to prevent anyone from viewing the archived versions of its site, but the law firm was still able to bring up certain archived pages.
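
At the time, the Wayback Machine honored exclusions addressed to its crawler's user-agent, ia_archiver, and applied them retroactively to pages it had already archived. A robots.txt file that walls off the entire archive is just two lines:

    User-agent: ia_archiver
    Disallow: /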

Healthcare Advocates argued that this constituted a circumvention of a technical measure designed to control access to a copyrighted work, which would violate the Digital Millennium Copyright Act. The company alleged that the firm used the Wayback Machine to bypass its technical measure, the robots.txt file, in order to view its copyrighted website.

The US District Court for the Eastern District of Pennsylvania wasn't buying it, however. The court last week pointed out that the law firm didn't do anything out of the ordinary in order to gain access to the archived pages that Healthcare Advocates had intended to block. Instead, the Wayback Machine simply malfunctioned and allowed the firm to view material that should have been blocked.

Normally, when the Wayback Machine receives a request for the archives of a site with a robots.txt file, it displays a blocked site error message. If any of the site's pages are not blocked, the error message will contain a link to those past versions.

Such a link came up when the law firm searched for the Healthcare Advocates site, even though the robots.txt file should have blocked all of the site's pages.
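
In rough terms, the gatekeeping behavior the court described looks something like the following sketch - a simplified Python reconstruction, not Internet Archive's actual code, with a made-up in-memory "archive" standing in for the real crawl data:

    # Simplified sketch of the blocking behavior described above --
    # not Internet Archive's actual code.
    from urllib.robotparser import RobotFileParser

    ROBOTS_TXT = "User-agent: ia_archiver\nDisallow: /\n"

    # Stand-in for the real store of crawled pages.
    ARCHIVE = {
        "http://example.com/": "<html>archived home page</html>",
    }

    def serve_archive(page_url):
        rp = RobotFileParser()
        rp.parse(ROBOTS_TXT.splitlines())  # the site's current robots.txt
        if rp.can_fetch("ia_archiver", page_url):
            return ARCHIVE.get(page_url, "not archived")
        return "Blocked Site Error"  # plus links to any unblocked pages

    print(serve_archive("http://example.com/"))  # -> Blocked Site Error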

Apparently, heavy server load on the days in question triggered a caching error that caused certain Internet Archive servers to "forget" that they had a copy of the Healthcare Advocates robots.txt file. Then, for unknown reasons, the servers also overlooked the robots.txt file when querying Healthcare Advocates' website directly.

This allowed the law firm to view some pages, but not others. Healthcare Advocates never asserted that the law firm had anything to do with causing the excessive load on the days when it tried to view the archived pages, and it was undisputed that the law firm used nothing more than an ordinary web browser to use the Wayback Machine.
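
As a hypothetical illustration of that failure mode - a guess at the general pattern, not the Archive's actual code - the bug amounts to treating a missing robots.txt as permission:

    # Hypothetical sketch of a fail-open robots.txt lookup, meant only
    # to illustrate the failure mode described above.
    def effective_rules(cache, fetch_live):
        rules = cache.get("robots.txt")  # cached copy evicted under load
        if rules is None:
            try:
                rules = fetch_live()     # direct query of the site ...
            except IOError:
                rules = ""               # ... overlooked or failed, so
        return rules                     # "no rules" is assumed

    def overloaded_fetch():
        raise IOError("server overloaded")

    # An empty cache plus a failed live lookup yields no restrictions,
    # so every archived page gets served.
    print(repr(effective_rules({}, overloaded_fetch)))  # -> ''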

The court held that, since the technical prevention measure never actually stood between the law firm and the copyrighted material, the law firm couldn't have circumvented it. Or, to use the court's phrase, "[t]hey did not 'pick the lock' . . . because there was no lock to pick."

The law firm never bypassed the Wayback Machine's obstructions in order to view the pages. The archives simply showed up as if there had been no robots.txt file in place at all. With those facts, the judge concluded, there had been no "hack," and the law firm had not violated the DMCA.

Healthcare Advocates' real beef should have been with Internet Archive for allowing the pages to slip through, but the San Francisco organization settled its way out of the lawsuit last year. The terms of the agreement are unknown, but the settlement kept any judgment against IA out of the public record.

This is fairly important when considered in the context of the judge's conclusion that a robots.txt file, under the facts of this case, actually does constitute a technical measure subject to the DMCA. That view, if adopted in other jurisdictions, could have widespread implications for Internet programmers.

While the facts of the current case limit the reach of the robots.txt ruling, it does open the door for a more expansive view of robots.txt files in the future. There may come a day when anyone writing code that ignores a robots.txt could be on the hook for violating the DMCA.
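
Staying on the right side of that hypothetical rule is not hard; Python's standard library can already check a robots.txt before fetching, as in this minimal sketch (the URL and user-agent string are invented):

    # Minimal robots.txt-respecting fetch using only the standard
    # library. The URL and user-agent are made-up examples.
    from urllib.robotparser import RobotFileParser
    from urllib.request import urlopen

    rp = RobotFileParser("http://example.com/robots.txt")
    rp.read()  # download and parse the site's live robots.txt

    url = "http://example.com/some/page.html"
    if rp.can_fetch("ExampleBot/1.0", url):
        page = urlopen(url).read()
    else:
        print("robots.txt disallows", url, "- skipping")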

Thus, it was a very good thing for Internet Archive that it got out of the suit when it did. Had it remained a defendant, the scope of the case would have been much broader, and the judge would have had an opportunity to rule on whether a program that overlooks a robots.txt file violates the DMCA.

An adverse ruling on that issue could have caused some serious problems for Internet Archive, as well as for some big-name deep-pockets out there (*cough* GOOGLE! *cough*). It would have undoubtedly created a whole new class of lawsuit against web services that missed or ignored a robots.txt as they scoured the Internet.

Which, ironically, would have been good news for the law firm defendant in this case.

They won the battle, but they may have lost a huge new revenue stream. ®

Kevin Fayle is an attorney, web editor and writer in San Francisco.
