Websites could be required to retain visitor info
Even if it would break their privacy policies
A series of legal events means that companies that have no business reason to retain documents or records may be compelled to create and retain such records just so they can become available for discovery.
Companies routinely create, maintain and store electronic records. Some records are consciously created – like memoranda, letters, spreadsheets, and even e-mails and chat or instant message communications. Other records are created inadvertently, like metadata, log records, IP history records and the like. Some information is useful to the company, which wants to retain it; other information is of little use, merely takes up space, creates potential liability, and represents an unwarranted exposure to attack or violation of privacy. The problem for most companies in developing or maintaining a document retention/destruction policy is identifying the documents and records they want to keep and effectively purging the ones they don't. Some recent legal events have made the problem of document retention and destruction even more complicated.
TorrentSpy.com is committed to protecting your privacy. TorrentSpy.com does not sell, trade or rent your personal information to other companies. TorrentSpy.com will not collect any personal information about you except when you specifically and knowingly provide such information.
Pretty straightforward, and not too dissimilar from thousands of other website privacy policies. Such privacy policies are considered to be legally binding contracts, and the United States Federal Trade Commission and privacy commissioners in Europe, Asia and other places routinely hold companies to their promises – under threat of civil and criminal prosecution or fines.
If you are engaging in malicious, unlawful, or otherwise "actionable" conduct, the website operator may certainly attempt to use this information to identify you and discern what you are doing – the essence of "personal information". Indeed, much of what we do as forensic investigators is to use this kind of information to find people.
While net-savvy individuals know that this information is being collected and utilized, the vast majority of individuals would not say that they "specifically and knowingly" provided that information to the website. This information frequently has economic value to the website operator as well. Knowing what site referred the user may result in payments from or to the referring site under "pay per click" agreements.
Log information - When you use Google services, our servers automatically record information that your browser sends whenever you visit a website. These server logs may include information such as your web request, Internet Protocol address, browser type, browser language, the date and time of your request and one or more cookies that may uniquely identify your browser.
Some of this information is collected automatically as a consequence of delivering web content to the requestor. You would think that, in keeping with its privacy policy, a company could choose not to collect, or more accurately not to store or retain, such information – after all, that's what it promised its customers, no?
There has long been an adage in the law that essentially states that "if it exists, it is discoverable". Now, as a result of a lawsuit involving TorrentSpy, the United States District Court for the Central District of California has essentially extended this logic to state that, "if it doesn't exist, we will require that it be created and stored so that it can become discoverable".
The case, Columbia Pictures v. Bunnell, arose when the movie studios wanted to find out the identity of people using TorrentSpy to download copyrighted works – personal information about TorrentSpy's users. TorrentSpy had promised its users that it wouldn't collect such information, and it had no legal obligation to collect it. As the court noted:
In general, when a user clicks on a link to a page or a file on a website, the website's web server program receives from the user a request for the page or the file. The request includes the IP address of the user's computer, and the name of the requested page or file, among other things. Such information is copied into and stored in RAM.
RAM is a form of temporary storage that every computer uses to process data. Every user request for a page or file is stored by the web server program in RAM in this fashion. The web server interprets and processes that data, while it is stored in RAM, in order to respond to user requests.
The web server then satisfies the request by sending the requested file to the user. If the website's logging function is enabled, the web server copies the request into a log file, as well as the fact that the requested file was delivered. If the logging function is not enabled, the request is not retained.
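The mechanics the court describes can be illustrated with a minimal sketch. This is not TorrentSpy's software or any real web server; the `Request` class, the `handle` function and the in-memory `files` dictionary are hypothetical names invented for illustration. It shows only that the same request is processed in memory either way, and is persisted only when a log is supplied.

```python
import io
from dataclasses import dataclass
from typing import Optional, TextIO


@dataclass
class Request:
    ip: str    # address of the requesting computer
    path: str  # name of the requested page or file


def handle(request: Request, files: dict, log: Optional[TextIO] = None) -> Optional[bytes]:
    """Serve a file; persist the request only if a log is supplied."""
    # At this point the request exists only in RAM, as a Python object.
    body = files.get(request.path)
    if log is not None:
        # Logging enabled: copy the request and the delivery outcome to the log.
        status = "delivered" if body is not None else "not found"
        log.write(f"{request.ip} {request.path} {status}\n")
    # Logging disabled: nothing is written, and the request is not
    # retained once this function returns.
    return body


files = {"/index.html": b"<html>hello</html>"}

log = io.StringIO()  # stands in for a log file on disk
handle(Request("192.0.2.10", "/index.html"), files, log=log)
handle(Request("192.0.2.11", "/index.html"), files)  # logging disabled

print(log.getvalue(), end="")
```

Only the first request leaves a trace in the log. The second was handled identically but exists nowhere once the function returns.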
After TorrentSpy was sued, the question arose whether the information NOT regularly collected by TorrentSpy – the information in RAM – constituted Electronically Stored Information subject to both discovery and what is called a litigation hold. Under a litigation hold, once you become aware that information you may possess is relevant to ongoing or threatened litigation, you must suspend your document destruction policy and stop deleting that relevant information.
Electronically Stored Information is defined under the Federal Rules of Civil Procedure as "information that is fixed in a tangible form and to information that is stored in a medium from which it can be retrieved and examined".
The court rejected TorrentSpy's claims that the information in RAM was never "stored" since logging was never enabled, and that requiring TorrentSpy to enable logging amounted to requiring it to "create" records that didn't exist. Certainly, the information in RAM was – for a brief time – stored at least transitorily, just as streaming media (like a VoIP call, or videoconference) is stored on your computer for the brief interval it is being displayed.
Thus, the information is (1) electronic; (2) stored; and (3) relevant. The consequence of this is that not only is the information subject to discovery under the TorrentSpy precedent, but the entity must then suspend its document deletion policy, which in the case of TorrentSpy was to delete information in RAM that it never stored.
Thus, when you learn of the possibility of litigation, you may have to START storing streaming media, contents of VoIP calls, contents of videoconferences, webinars, chats, instant messages, logs, scans, or other electronic records that you never stored before.
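The conventional half of a litigation hold, suspending routine deletion for relevant records, can be sketched as a purge routine that checks a hold list before deleting. This is a minimal illustration under stated assumptions: the `Record` class, the `matter` tag, the `purge` function and the identifier "case-123" are all hypothetical, and a real retention system would be considerably more involved.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Record:
    name: str
    created: datetime
    matter: str = ""  # hypothetical tag linking a record to a legal matter


def purge(records, now, max_age, held_matters=frozenset()):
    """Return the records that survive a routine retention purge."""
    kept = []
    for r in records:
        expired = now - r.created > max_age
        on_hold = r.matter in held_matters
        # Expired records are normally deleted, but a hold overrides deletion.
        if not expired or on_hold:
            kept.append(r)
    return kept


now = datetime(2007, 8, 1)
records = [
    Record("old-newsletter", datetime(2006, 1, 1)),
    Record("old-server-log", datetime(2006, 1, 1), matter="case-123"),
    Record("recent-memo", datetime(2007, 7, 1)),
]

normal = purge(records, now, timedelta(days=365))
held = purge(records, now, timedelta(days=365), held_matters={"case-123"})

print([r.name for r in normal])  # ['recent-memo']
print([r.name for r in held])    # ['old-server-log', 'recent-memo']
```

Note what the sketch cannot do: it can only preserve records that already exist. The TorrentSpy ruling goes a step further, reaching data that was never written anywhere a purge routine could find it.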
ISPs, Portals and Telcos
A similar issue arises with respect to information held by Internet Service Providers (ISPs), web portals like Google, Yahoo and Microsoft, and telephone companies. These entities routinely collect massive volumes of data about their clients and customers – including things like search requests and results, IP history information, logon information, services utilized, date, time, source, destination, and duration of calls.
VoIP providers or ISPs may also store the contents of voice or video communications temporarily as a consequence of transmission over the packet network. Remember the adage – if it exists, it is discoverable.
Now, there are legitimate reasons for companies to want to collect, store and use at least some of this information. There are business models based on the analysis of this information. Load balancing, billing, and even selling this information are all legitimate uses (provided that the consumer has some awareness that this is going on). What is important is that the provider – the telco, the ISP or the portal – decides what information is going to be collected, how it is going to be used, whether it is going to be stored (and for how long) and then communicates these facts to the consumer.
There has long been a debate over how long these entities will retain the records, and what they will do with them. The Department of Justice and the FBI have long sought authority to require ISPs, telcos and others to retain log data and other data at their own expense, "just in case" the information might later become relevant to some investigation.
European countries have also been engaged in the same dialogue. If the records are retained (even when there is no business reason for keeping them) the records become discoverable – by grand jury subpoena, FISA or Title III wiretap orders, National Security Letters, or by voluntary cooperation by the ISP or subject. They also become available in any other litigation – copyright infringement, defamation, or routine divorce cases.
Since the ISP or portal would generally be a third party with respect to the underlying litigation, it might not be mandated to create or permanently store log or other transitory information, but that is not entirely clear. What is clear is that the government wants companies that create electronic data to keep it "just in case".
Indeed, ABC News reported that the FBI, in a Department of Defense authorization bill, requested a grant of $5m to pay telephone companies to store information such as call records, and to develop a method of retrieving such information at the request of law enforcement. As reported by ABC News:
The $5m project would apparently pay private firms to store at least two years' worth of telephone and Internet activity by millions of Americans, few of whom would ever be considered a suspect in any terrorism, intelligence or criminal matter. The project would involve "the development of data storage and retrieval systems...for at least two years' worth of network calling records," according to an unclassified budget document posted to the FBI's Web site.
So instead of warehousing the records themselves (and with no legal authority to subpoena ALL records), the government is essentially issuing a document preservation request to the telephone companies, requesting that the records be kept by the telcos for two years, and agreeing to pay all or some of the cost of doing so.
Effectively, this makes the telephone companies into the warehouses for the government and for anybody with a subpoena. Note that there is nothing wrong with the phone companies keeping these records for their own business purposes, but now they will be keeping them presumably just in case.
Web portals like Google, Yahoo! and Microsoft learned the lesson of the adage that if records exist they will be subpoenaed when, in the context of defending Congress' anti-smut statute, the government subpoenaed (in a civil lawsuit) massive volumes of data about how people used these portals, what they searched for, and what was ultimately delivered.
As a result of this, and of the document retention requests by law enforcement and regulators, all of the major portals have voluntarily agreed to anonymize their records after a period of time – Yahoo! after 13 months, Google and Microsoft after 18 to 24 months.
Ask.com went further, offering a service called AskEraser, which it claims would allow for anonymous web surfing, and under which "the company claims it will not retain the search histories of customers who opt in for the AskEraser".
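Time-based anonymization of the kind the portals agreed to might look roughly like the sketch below. The portals' actual methods are not described here, so everything in it is an assumption: the log entry layout, the function names, the choice to zero the last octet of an IPv4 address (one common and only partial technique), and the approximation of 18 months as 540 days.

```python
import ipaddress
from datetime import datetime, timedelta


def anonymize_ip(ip: str) -> str:
    """Zero the last octet of an IPv4 address (assumed technique, IPv4 only)."""
    network = ipaddress.ip_network(f"{ip}/24", strict=False)
    return str(network.network_address)


def anonymize_old(entries, now, max_age):
    """Anonymize the IP in every (timestamp, ip, query) entry older than max_age."""
    result = []
    for ts, ip, query in entries:
        if now - ts > max_age:
            ip = anonymize_ip(ip)
        result.append((ts, ip, query))
    return result


now = datetime(2007, 8, 1)
entries = [
    (datetime(2005, 12, 1), "192.0.2.77", "query a"),    # older than 18 months
    (datetime(2007, 7, 1), "198.51.100.23", "query b"),  # recent
]

cleaned = anonymize_old(entries, now, timedelta(days=540))
print([ip for _, ip, _ in cleaned])  # ['192.0.2.0', '198.51.100.23']
```

Until the cutoff passes, the full record still exists, and so remains available to anyone with a subpoena.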
Which brings us back to where we started. Just because you promise NOT to collect or retain records doesn't mean that you won't be required to collect and maintain them. Even if you don't have technology readily available to capture data streaming through your network, if the information is stored there briefly, you may be required to capture it.
Sure, you can try anonymizing technologies, but these usually work by NOT LOGGING data, which as we learned with TorrentSpy doesn't always work. What we need is a commonsense approach to what really is a record that is stored by a company, as opposed to log data which COULD be stored by a company.
This article originally appeared in Security Focus.
Copyright © 2007, SecurityFocus