Original URL: http://www.theregister.co.uk/2005/03/03/google_autolink/
Google AutoLink: enemy of the people?
Don't be evil ... mostly
Success sometimes makes people do funny things, things that may seem bizarre, childish, or even foolhardy to others. To those undergoing the new brush with wild success, however, their actions make complete sense. For instance, several years ago a retired construction worker named Phil Lee won the lottery in British Columbia, Canada. It wasn't an enormous amount of money as lotteries go - only about $100,000 Canadian, or about $76,000 American - but it was a good amount for Phil.
When asked how he was going to spend his money, Phil tossed out several ideas. First of all, he thought of others: he was going to give some money to his family. Next, he thought of himself, which no one could begrudge, by pledging to buy some decent walking shoes and a new set of false teeth. For a man in his 60s, those were good purchases.
But it was Phil's final plan for his money that made me remember him. He said that he was going to use some of his money to buy a tombstone - an extra special tombstone that would have engraved on it the words "Been there, done that", and quite a series of pictures: "a champagne glass, a royal flush, a slot machine, a nude woman facing backwards and a stick of dynamite with a lit fuse." Wow! Phil, I salute you!
Since I last took a good look at Google, a lot has changed for the search engine. It's gotten bigger, for one thing, with new hires, an almost continual rollout of new services, and an IPO worth billions of dollars. It's also facing increased competition now, mainly from Microsoft and Yahoo, but from others as well.
Unfortunately, the bad guys are making more use of Google now than they ever have. My last column on Google was about the practice of using the search engine to find vulnerabilities: files or data left exposed accidentally, exploitable bugs in software, or open holes in networks. At the time, I made a suggestion: "A couple of websites have even sprung up dedicated to listing words and phrases that reveal sensitive information and vulnerabilities. My favorite of these, Googledorks, is a treasure trove of ideas for the budding attacker. As a protective countermeasure, all security pros should visit this site and try out some of the suggestions on the sites that they oversee or with whom they consult. With a little elbow grease, some Perl, and the Google Web API, you could write scripts that would automate the process and generate some nice reports that you could show to your clients. Of course, so could the bad guys ... except I don't think your clients will ever see those reports, just the end results."
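As a rough illustration of the defensive side of that suggestion, here is a minimal Python sketch (not the Perl-plus-Google-Web-API script described above) that turns a list of revealing phrases into site-restricted queries and a plain-text checklist a consultant could work through by hand. The phrases, the example.com domain, and every function name are assumptions made up for illustration; the sketch only builds query strings and URLs and does not call any search API.

```python
from urllib.parse import urlencode

# Illustrative phrases only (an assumption); a real audit would draw on a
# maintained list such as the one the column mentions.
SAMPLE_PHRASES = [
    'intitle:"index of" "parent directory"',
    'filetype:log "password"',
    'inurl:admin intitle:login',
]


def build_audit_queries(domain, phrases):
    """Restrict each phrase to a single domain with the site: operator."""
    return [f"site:{domain} {phrase}" for phrase in phrases]


def to_search_url(query):
    """Turn a query into a URL that a reviewer can open by hand."""
    return "https://www.google.com/search?" + urlencode({"q": query})


def report(domain, phrases):
    """Build a plain-text checklist of queries for one domain."""
    lines = [f"Search audit checklist for {domain}"]
    for query in build_audit_queries(domain, phrases):
        lines.append(f"- {query}")
        lines.append(f"  {to_search_url(query)}")
    return "\n".join(lines)


if __name__ == "__main__":
    # example.com is a placeholder for a site you are authorized to review.
    print(report("example.com", SAMPLE_PHRASES))
```

Keeping the output as a checklist rather than firing off automated searches is a deliberate choice here: it stays within what a reviewer can do for sites they actually oversee.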
Well, it looks like a black hat out there somewhere is doing exactly what I warned about. At the end of last year, a new worm appeared that targeted the widely used (and generally excellent) open source software package PHP Bulletin Board, or phpBB. The worm would find an installation of phpBB with an exploitable PHP flaw, take over the site, delete all pages built with HTML, PHP, ASP, and JSP, and then replace the text of the site with "This site is defaced!!! This site is defaced!!! NeverEverNoSanity WebWorm generation X." "X" would actually be a number representing how many generations down from the first infection this particular infection was. In some cases, researchers found the 24th generation of the worm. Ay yi yi.
But what was really interesting was the means the worm used to spread: Google. After an infection occurred, the Santy worm, as it came to be named, would search Google using search phrases that unearthed phpBB installs that had failed to upgrade to a patched version of PHP, and would then target those versions of phpBB, to devastating effect. It took Google over ten hours to block that query, ten hours that resulted in 40,000 sites defaced. It appears that Google has some work to do in the area of security: it needs to make itself open and available to those who find problems like the Santy worm, so that it can act more quickly, and even proactively, to stop other worms and nasties that use Google before they spread widely.
Of course, Google is not the sole target. Santy.B uses Yahoo and AOL Search to find vulnerable phpBB installs, while Santy.D uses the Brazilian Google. It's just that the main Google site is still the most used search engine in the world, and it's certainly the one with all the buzz.
A worm isn't the only interesting use of Google as an unwilling participant in criminal activities. In addition to vandals like those that went after phpBB, thieves are also using Google to bilk money out of unsuspecting users. Until now, phishing has primarily been a problem associated with email: you get an email purporting to be from PayPal, eBay, or your bank asking you to update some info, you click on the link and end up at what appears to be an official site, you enter the sensitive personal info and hit Submit, and a criminal now knows your username, password, credit card number, and other data.
Now it appears that phishers are bypassing email and counting on Google to bring them victims. A bad guy sets up a fake e-commerce site and waits for Google to index it. A patsy types a query into Google - let's say "beanie babies" (yes, people still collect 'em!) - and ends up at a real-looking site selling those goods. Under each thumbnail image is a link to "View larger image," so the patsy clicks that link. Instead of an image, a self-extracting Zip file installs a Trojan horse on the patsy's PC, and then we're off to the races. Or the site allows users to "buy" the beanie babies (or whatever else is hot right now) and pay via credit card, but never sends the goods. In both cases, the patsy is in a world of hurt.
As I said in the last article, I'm not blaming Google; I just wish they would work more actively to help weed out some of the issues raised above. I rely on Google to get my daily work done, and I think they generally try to adhere to their corporate motto: "Don't be evil." Generally. But now I'm really starting to have my doubts when it comes to privacy and the rights of writers and publishers. I'm beginning to wonder if Google is letting success go to its head. It's a debate that I'm having in my own head, based on the debates that have been occurring in the blogosphere over the past couple of weeks.
My worries actually began a few years ago, when I first found out that Google's website cookie doesn't expire until 2038 (yet another good reason to periodically clear out your cookie cache). With the beta release of Google's toolbar a few weeks ago, the debate has really ratcheted up to new levels.
Google introduced a new feature with its Windows/IE toolbar: AutoLink. When a user presses the AutoLink button on the toolbar, new links are created on the current web page, including the following:
Book ISBN - Amazon.com
Address - Google Maps
Car license plate number - Carfax
Package tracking numbers - UPS or FedEx
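To make the mechanism concrete, here is a minimal Python sketch of this kind of page rewriting, assuming that recognizable patterns in unlinked text are wrapped in new links while existing links are left alone. It is not Google's implementation: the link targets, the patterns (only ISBNs and UPS-style tracking numbers, in simplified form), the class name, and all function names are assumptions for illustration, and the parser ignores comments, doctypes, and script content.

```python
import re
from html import escape
from html.parser import HTMLParser

# Hypothetical link targets; the toolbar's real destinations and URL formats
# are not described here.
ISBN_URL = "https://books.example/isbn/{}"
TRACKING_URL = "https://parcels.example/track/{}"

# Simplified patterns: "ISBN" followed by 10 or 13 digits (hyphens allowed),
# or a UPS-style tracking number ("1Z" plus 16 capital letters or digits).
PATTERN = re.compile(
    r"ISBN[:\s]*(?P<isbn>\d[\d-]{8,15}[\dXx])"
    r"|(?P<ups>\b1Z[0-9A-Z]{16}\b)"
)


def _link(match):
    """Wrap one recognized pattern in a new link."""
    if match.group("isbn"):
        digits = match.group("isbn").replace("-", "")
        if len(digits) not in (10, 13):
            return match.group(0)  # not a plausible ISBN; leave it untouched
        url = ISBN_URL.format(digits)
    else:
        url = TRACKING_URL.format(match.group("ups"))
    # The class name marks the link as added rather than authored.
    return f'<a class="added-link" href="{url}">{match.group(0)}</a>'


class AutoLinker(HTMLParser):
    """Re-emit HTML, adding links only to text that is not already linked."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []
        self.anchor_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self.anchor_depth += 1
        self.out.append(self.get_starttag_text())

    def handle_endtag(self, tag):
        if tag == "a" and self.anchor_depth:
            self.anchor_depth -= 1
        self.out.append(f"</{tag}>")

    def handle_startendtag(self, tag, attrs):
        self.out.append(self.get_starttag_text())

    def handle_data(self, data):
        text = escape(data, quote=False)
        if self.anchor_depth == 0:  # only unlinked text is rewritten
            text = PATTERN.sub(_link, text)
        self.out.append(text)


def autolink(html):
    """Return the page with new links added to unlinked ISBNs and parcels."""
    parser = AutoLinker()
    parser.feed(html)
    parser.close()
    return "".join(parser.out)
```

Calling autolink on a fragment that mentions the same ISBN twice, once as plain text and once inside an existing link, rewrites only the plain-text occurrence. The class attribute is there so that added links could be styled differently from the author's own.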
Now, there are a few things critics of AutoLink have ignored. For instance, it's not enabled by default. A user has to push the AutoLink button every single time they want to enable its use on a page. Further, current links are not overwritten; only unlinked text is affected. Even so, I'm really torn about AutoLink. I use other services that rewrite the code of a web page in certain ways to benefit me, like the BetterSearch Firefox extension, for instance, which (ironically enough) rewrites Google results to display a thumbnail image of each search result's home page. And I'm an enormous, grateful fan of the Adblock extension, which allows me to remove advertisements and other annoyances from websites. I could go on. In all those cases, I'm in essence rewriting content on a web page that I'm viewing, which one could also argue is what AutoLink does.
But I'm growing convinced that the problems AutoLink brings up are greater than the benefits. I'm not alone in this: plenty of others, like Dave Winer, Robert Scoble (at Microsoft, no less), Danny Sullivan of SearchEngineWatch, and noted web designer and developer Jeffrey Zeldman are criticizing AutoLink as well.
Google's relationship to the web reminds me of those old SAT analogy questions; in this case, it would look something like this:
Microsoft : operating systems :: Google : web search engines
Back in 2001, during the beta release of IE 6, Microsoft introduced a Smart Tags feature that acted much the same as Google's AutoLink (for an excellent, detailed analysis of the problems associated with Smart Tags, read Chris Kaminsky's masterful "Much Ado About Smart Tags"). There was quite a brouhaha, and Microsoft withdrew the feature.
Looking back, there are eerie similarities between Microsoft's Smart Tags and Google's AutoLink (not surprising, since the same guy created both), with two key differences: at least Microsoft's links were purple and dotted, making it relatively obvious that they were different from the normal links on a web page, while Google's links are blue and underlined, just like the vast majority of links found on most web pages. In other words, once AutoLink is pressed, the viewer will not be able to tell which links were put there by the page's author and which were put there by AutoLink. Granted, holding your mouse over the link and waiting for a tooltip to open will indicate that the link comes from Google, but I'm not sure how many users are going to do that. In fact, given the state of most web users' knowledge, I have serious doubts that they'd even understand what the tooltip's text meant in the first place.
I work hard in these columns to pick interesting, informative links that back up my statements, provide detail where I must be terse, or entertain with a sarcastic comment on my text. It's as much part of my writing as the words I use. In fact, in 2005, I would go so far as to say that for any writer using the web as a platform, links are part of his or her writing. When Google changes the links on this web page, Google changes my writing, without any input from me, and for commercial gain that certainly doesn't benefit me, or SecurityFocus, the publisher of my columns. If I were an online bookstore, the fact that my ISBNs turned into links to a competitor like Amazon would make my blood boil. In essence, Google - and selected partner companies - benefit commercially from my work, and I see nothing for it. Alternatively, on my web site, I provide a lot of stuff under a Creative Commons license, but AutoLink ignores it and commercializes things I do not wish commercialized.
On top of those objections, let's add one that should particularly resonate with SecurityFocus readers: privacy. Google's cookie doesn't expire until 2038; add onto that the data that the Google toolbar can gather about users, and you have a data mining tool second to none. This makes me very, very nervous. "Don't be evil"? How about "Don't be evil ... mostly. Kinda. Pretty much. Maybe."?
I've been thinking about it, and I'm going to keep Adblock for now, since its operation is completely dependent upon my actions - nothing gets blocked unless I explicitly enter a URL to block - and since I'm removing annoyances, not augmenting content. In between Adblock, which seems OK to me, and AutoLink, which isn't, is BetterSearch. BetterSearch does change the Google results page, but it's not changing the original content. Instead, it clearly adds an enhancement. However, this does raise the question: at what point does enhancement cross the line? Frankly, it's a notion that's still up for debate, and I'm interested in your take.
I hope Google reconsiders their actions. If they proceed with AutoLink, how long until Microsoft decides that it's OK to bring back Smart Tags to IE? And I have news for Google: a lot more machines in this world have IE on them than have IE plus the Google toolbar. If Google's not careful, we may one day be talking about another tombstone for another defeated company - and I somehow don't think that tombstone will be nearly as entertaining as the one that Phil Lee designed for himself.
Scott Granneman is a senior consultant for Bryan Consulting Inc. in St. Louis. He specializes in Internet Services and developing web applications for corporate, educational, and institutional clients.