Yahoo! mocks Google Privacy Theatre

Less-nonsensical anonymization

Analysis The privacy gap between Yahoo! and Google is greater than you think. It's not just that Yahoo! will anonymize user search data six months before Google does. It's that Yahoo!'s anonymization is less nonsensical than Google's.

Today, as we dutifully reported, Yahoo! said it would anonymize user search data within a mere 90 days (with exceptions for fraud, security, and legal obligations). It even agreed to extend this unprecedented policy to page views, page clicks, ad views, and ad clicks.

Of course, anonymization is a meaningless word. But it would seem that Yahoo!'s use of the term isn't nearly as misleading as Google's. When Yahoo! says it will anonymize log data, it intends to:

  • Delete the final octet of the user's IP address
  • Run the user's Yahoo! ID through a one-way secret hash and delete the last 50 per cent of the hashed identifier
  • Run the user's cookie identifiers through a one-way secret hash
  • Filter all personally identifiable information - such as credit card numbers, social security numbers, and non-popular names - from search queries
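For the technically curious, the first three steps can be sketched in a few lines of Python. This is an illustration only, not Yahoo!'s actual code: the key `SECRET_KEY`, the choice of HMAC-SHA256 as the "one-way secret hash", the record field names, and the decision to zero the final octet rather than drop it are all assumptions. The fourth step, filtering personal data out of query text, is left out because Yahoo! has not said how it is done.

```python
import hashlib
import hmac
import ipaddress

# Hypothetical secret key; the real key and hash function are not public.
SECRET_KEY = b"hypothetical-secret-key"


def truncate_ip(ip: str) -> str:
    """Zero the final octet of an IPv4 address (one reading of 'delete')."""
    addr = ipaddress.IPv4Address(ip)
    masked = int(addr) & 0xFFFFFF00  # keep the first 24 bits only
    return str(ipaddress.IPv4Address(masked))


def keyed_hash(value: str) -> str:
    """One-way keyed hash, here assumed to be HMAC-SHA256 (64 hex chars)."""
    return hmac.new(SECRET_KEY, value.encode("utf-8"), hashlib.sha256).hexdigest()


def anonymize_user_id(user_id: str) -> str:
    """Hash the ID, then throw away the last half of the hashed identifier."""
    digest = keyed_hash(user_id)
    return digest[: len(digest) // 2]


def anonymize_cookie(cookie_id: str) -> str:
    """Hash the cookie identifier; the full digest is kept."""
    return keyed_hash(cookie_id)


def anonymize_record(record: dict) -> dict:
    """Apply the three steps to one log record (field names are assumed)."""
    return {
        "ip": truncate_ip(record["ip"]),
        "user_id": anonymize_user_id(record["user_id"]),
        "cookie": anonymize_cookie(record["cookie"]),
        "query": record["query"],  # PII filtering of queries is not sketched
    }
```

Note that the hashed cookie is still a stable identifier: the same cookie always produces the same digest, so records can still be grouped by it.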

In its lust for targeted advertising and who knows what else, Yahoo! has stopped short of true anonymization: deleting IPs, IDs, and cookie info entirely. Recreating this data isn't beyond the realm of possibility. But at Google, recreation is trivial.

The Mountain View Chocolate Factory says it will - at some unspecified point in the future - anonymize user data after nine months. But it takes some additional liberties with the word "anonymize".

With its nine-month anonymization, Google intends to "change some of the bits" in the user IPs stored on its servers. But that's it. The plan would leave cookie data alone.

And that means IPs are easily restored.

Google may erase certain IP bits on your nine-month-old search queries, but those bits will remain intact on newer queries - and both sets of queries will carry the same cookie info. Recovering the missing bits on older data is a one-step process.
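That one step is a lookup on the cookie. The sketch below is a hedged illustration and not a description of Google's systems: the log layout, the field names, and the assumption that the erased bits are the final octet of an IPv4 address are all hypothetical, since Google has not said which bits it changes. It also assumes the user's address has stayed on the same network between the old and new queries.

```python
def same_network(masked_ip: str, full_ip: str) -> bool:
    """True if both IPv4 addresses share their first three octets."""
    return masked_ip.split(".")[:3] == full_ip.split(".")[:3]


def restore_ips(old_logs: list, new_logs: list) -> list:
    """Fill in doctored IPs on old records using newer records' cookies.

    Each record is a dict with hypothetical 'cookie' and 'ip' fields.
    Old records carry a masked IP; new records carry the full IP.
    """
    # Map each cookie to the full IP seen on a newer query.
    cookie_to_ip = {entry["cookie"]: entry["ip"] for entry in new_logs}

    restored = []
    for entry in old_logs:
        full_ip = cookie_to_ip.get(entry["cookie"])
        if full_ip is not None and same_network(entry["ip"], full_ip):
            # Same cookie, same network: the missing bits are recovered.
            restored.append({**entry, "ip": full_ip})
        else:
            # No newer record for this cookie, or the network changed.
            restored.append(dict(entry))
    return restored
```

Given an old record with cookie `c1` and IP `203.0.113.0`, and a newer record with the same cookie and IP `203.0.113.77`, `restore_ips` returns the old record with its IP set back to `203.0.113.77`.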

After 18 months, Google does alter cookie data - in some unspecified way. And the company argues that users have the power to scrub their own cookies before then. "We have focused on IP addresses, because we recognize that users cannot control IP addresses in logs," the company has told us. "On the other hand, users can control their cookies.

"When a user clears cookies, s/he will effectively break any link between the cleared cookie and our raw IP logs once those logs hit the 9-month anonymization point. Moreover, we are still continuing to focus on ways to help users exert better controls over their cookies."

Of course, most users don't even know what a cookie is.

Plus, Google has not said it will disassociate search queries from your Google ID - required for using Google services such as Gmail or Google Docs and Spreadsheets.

In September, Google also said it might tweak its nine-month policies. But today, in an email, the ad broker provided no update. At the moment, it's unclear when Google will even begin its nine-month IP doctoring.

But the company wants you to know it takes privacy very seriously. "We aim to strike the appropriate balance between protecting our users' privacy and offering them benefits of data retention, such as better security measures and new innovations," it said.

It did not mention advertising.

Yes, Yahoo! is balancing as well. But the wounded web portal has gone significantly further than Google to protect its users from hacks, subpoenas, and, yes, national security letters. The rub is that Yahoo! handles about 20 per cent of US search traffic - and Google commands 70. ®
