The trails left in Web server logs – and who's seeing them

Fear of a million Big Brothers

NEW YORK--The privacy advocates and civil libertarians at the 13th annual Computers, Freedom and Privacy conference sometimes seem dwarfed by the enormity of the projects they oppose -- larger-than-life enterprises worthy of a James Bond villain.

John Poindexter's Total Information Awareness project, if successful, would combine every government and private sector database into a massive data mining system capable of picking out aberrant behavior in the actions of seemingly ordinary citizens. The Department of Homeland Security's CAPPS II program aims to run automatic background checks on every airline passenger in the U.S.

But the day before CFP 2003 began, a smaller invitation-only group of technologists and policy wonks met at the conference site to discuss a matter that some say is just as important to Internet privacy as any of the monolithic omniscient supercomputers being hatched in Washington: the humble Web server log.

Or more to the point, the countless thousands of logs routinely kept by servers throughout the Internet, each marking every visit to a given website, identifying what pages were viewed, what transactions were made, and the IP address of the visitor. Recent laws have made it easier for government agencies to get their hands on server log entries, and civil litigators are increasingly finding logs a valuable target for subpoenas. At the same time, the art of wringing every ounce of useful information out of such logs is advancing, as is the ease of tracking down a user's identity from their IP address by correlating data from different sources.

Last month, scientists at Carnegie Mellon University's Laboratory for International Data Privacy even published a formal algorithm for "re-identifying" a Web surfer from pieces of information left like breadcrumbs on different sites. "The methodology involves constructing trails across locations from small amounts of seemingly anonymous or innocuous evidence the person has been there," the paper reads.

That's a troubling prospect to privacy advocates, at a time when activists and human rights workers in repressive countries are using the Internet to communicate, while ordinary netizens are turning to the Web for things like medical information or personal finance. "It's our sense that certain companies have entire staffs dedicated to handling subpoenas and court orders, and quite often those subpoenas and court orders involve usage logs," says Will Doherty of the Electronic Frontier Foundation.

Smaller companies may be keeping logs without thinking about the potential for misuse, and a careful Google search can turn up random server and proxy logs sitting unprotected on the Web. "Most people don't give it any thought; their default is to just log anything in Apache or IIS," says Richard Smith, a technology and privacy consultant. "At most, they have to worry about how much disk space it's taking up."

It's with that vision of a million tiny surveillance logs growing like weeds that the informal "User Log Data Management Working Group" held its first day-long meeting Tuesday. "We got as far as discovering the extent of the problem, and some sense of who had an interest in it," says Jeff Ubois, the workshop's organizer. In addition to Doherty and Smith, the 18-odd attendees included Internet archivist Brewster Kahle, FTC consumer-protection attorney Laura Mazzarella, and John Young, curator of the controversial full-disclosure cryptography and intelligence site Cryptome.org. Young, who himself has received at least one broad subpoena for usage log information, takes pride in deleting his logs on a daily basis.

Nobody expects Yahoo or MSNBC.com to delete their logs every day. But attendees say the workshop concluded that companies of all sizes need to become more familiar with the privacy risks of their routine logging. The group plans to launch an education campaign to dispel the notion that Internet surfing is anonymous by default. "If it becomes widely believed that IP addresses are personally identifiable, that has implications for businesses that are logging them," says Ubois.

The group is also working on specifications for a free open-source tool that would allow administrators to easily trim unwanted information from their logs. Smith, who occasionally moonlights as a forensic crime fighter, admits that Web server logs can serve a valuable purpose in tracking down bad guys. But he says webmasters should know the significance of the data they routinely collect. "Most of this is about educating people that this could leave them in the legal line of fire," he says.
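To make the idea of trimming a log concrete, here is a minimal sketch of what such a tool might do. It is not the working group's specification, which the article does not describe. It assumes an Apache-style Common Log Format line whose first field is the client address, and the function names `mask_ip` and `trim_log_line` are hypothetical. The sketch zeroes the low-order bits of each address (keeping a /24 for IPv4 and a /48 for IPv6, both arbitrary choices), so the log still shows roughly where traffic came from without recording the exact visitor.

```python
import ipaddress
import re

# Matches the leading client-address field of a Common Log Format line.
LEADING_HOST = re.compile(r"^(\S+)")


def mask_ip(host: str) -> str:
    """Zero the low-order bits of an IP address.

    Anything that does not parse as an IP address (for example a
    resolved hostname) is returned unchanged.
    """
    try:
        addr = ipaddress.ip_address(host)
    except ValueError:
        return host
    # Assumed prefix lengths: keep a /24 for IPv4 and a /48 for IPv6.
    prefix = 24 if addr.version == 4 else 48
    network = ipaddress.ip_network(f"{addr}/{prefix}", strict=False)
    return str(network.network_address)


def trim_log_line(line: str) -> str:
    """Return the log line with its client address masked."""
    return LEADING_HOST.sub(lambda m: mask_ip(m.group(1)), line, count=1)


if __name__ == "__main__":
    sample = '192.0.2.57 - - [03/Apr/2003:10:00:00 -0500] "GET /index.html HTTP/1.0" 200 2326'
    print(trim_log_line(sample))
```

Masking is only one option. Dropping the address field entirely, or deleting logs on a schedule as Young does, removes more information at the cost of the forensic value Smith describes.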

© SecurityFocus