Microsoft finally cuts Bing data retention time to six months

Anonymise this!

Microsoft has finally slashed the amount of time it keeps some online search query data to just six months, over a year after it declared it would make the change if the likes of Google and Yahoo! agreed to play ball.

The company’s privacy chief Peter Cullen said late yesterday that Microsoft planned to implement the changes to its data retention policy over the next 12 to 18 months.

“We will delete the entire Internet Protocol address associated with search queries at six months rather than at 18 months,” he said.

“This new and significant step will be incorporated into our existing privacy practices, which already provide strong protections for Bing users.”

In December 2008 Microsoft said it supported the Article 29 Working Party’s guidelines for anonymisation on the web, before adding that such rules could only be adopted if they were introduced industry-wide.

The Article 29 Working Party is a group of European Union bureaucrats who have been pushing to get search engine firms to purge their user records after six months.

Under Microsoft’s previous policy, the software vendor claimed it took steps to “de-identify” the data by cutting it loose from account information that could uncover the person who performed the search in Bing.

However, the remaining data were left to languish online for 18 months before MS droids finally deleted the IP address and dumped the de-identified cookie ID and any other cross-session IDs associated with the query.

Cullen said Microsoft had no plans to change the fundamentals of that policy. However, the firm will start to delete IP addresses associated with Bing search queries after the data has been available online for six months.

All of which isn't a million miles away from Google’s current lukewarmish approach to anonymising an individual’s search data online.

Redmond will similarly leave the de-identified cookie and cross-session IDs intact for now, but it claimed that after 18 months it will suck all the data out of the intertubes for good.

In September 2008 Google agreed to halve the amount of time it retained IP addresses and user data garnered from search query logs.

At the time, the internet kingpin said it would anonymise IP addresses on its server logs after nine months “to address regulatory concerns to take another step to improve privacy for our users”.

But Mountain View later admitted to El Reg that it would only "change some of the bits" in the user IPs stored in its server logs, while leaving the all-important cookie data alone.

“There are many good reasons to retain and review search data. Studying trends in search queries enables us to improve the quality of our results, protect against fraud and maintain a secure and viable business,” said Cullen, who echoed Google’s previous justification for keeping the data online.

“But consumer privacy can and must be preserved. For our part, Microsoft continues to examine our practices to ensure we strike the right balance and achieving [sic] both goals.” ®
