Microsoft's Google-killer arrives with a 'whuh?'
Better than Google, except the results stink
Google's executives might be sleeping a little easier this weekend after Microsoft unveiled its much-hyped new search engine. It's fast, slick, and comes with a raft of interesting new features: confounding some expectations as surely as it confirms others. In short, Microsoft has produced a search engine that's better in almost every way than Google, except for one: its search results are terrible. But let's start with the good stuff.
Incredibly, MSN Beta Search trumps Google for speed: it's an order of magnitude faster. Anyone who doubted that Microsoft could deliver a large scale distributed cluster, and that's probably most of you, will be surprised at the nippy performance (although the true test comes when the system has to scale under heavy loads, of course).
Microsoft has also made building complicated queries much more attractive than its rivals have. Click on the "Search Builder" option and you get five additional fields which you can add, one at a time, the fifth being three gauges for altering the search term's topicality, popularity, and semantic accuracy. This puts all its rivals to shame, and makes Google's Advanced Search page look about as appealing as an Assembly Language manual. Microsoft's new engine also has a rough caching service modeled on Google's cache, but without the keywords highlighted in colors: one of Google's most subtle and enduringly useful UI features.
Microsoft has also been busy in other departments. It attempts to produce a natural language answer when it thinks it has been asked a particular kind of question. Asking "What's the capital of England?" gives the answer "London", for example. It didn't fare so well with the question "How many mickle in a muckle?", but it's a start.
But MSN Beta Search falls down badly where it really matters: in delivering results with any relevancy. Like Google, it struggles to distinguish between a source query and an effect query. Searching for "John Leyden"+"blaster worm" and "John Lettice"+"Windows" returned a lot of prattle, but hardly any original articles. When a search is so specific, you might reasonably expect to receive source articles, rather than what people are saying about them. And this illustrates a fundamental blindspot that search engine designers and web-happy techno utopians both exhibit: they mistake the web for the world.
Fancy a quick search query?
You might want to try this experiment for yourselves. Imagine yourself in a foreign country with full access to Google or WAP, and a bar full of strangers. You need to find a good local restaurant, or a bar, or just something to remember your visit by. Who will give you the answer quicker, and who will give you the better answer - your immediate neighbors, or the computer network? Ten minutes with Google or WAP aren't going to deliver anything useful to you - whereas ten minutes interacting with a stranger might produce quite extraordinary and unexpected results, for this is where the world lives. As long as humans venture outdoors to socialize, computer networks will always come a poor second.
Equally, computer networks will continue to frustrate everyone except the kind of people who design computer networks. After so long smelling only your own shit, the whole world starts to smell tangy and brown, and both Microsoft and Google allowed themselves to indulge in this whimsy yesterday.
On Google's PR blog, we learned that Google's index had doubled overnight to 8 billion pages. (Where had they been keeping the new 4 billion pages all this time, you might well ask.)
"Together these pages represent a good chunk of the world's information, but hardly all of it," wrote Google's VP of engineering Bill Coughran, in what might be the understatement of the century.
Precious little of the "world's information" is even written down. Much of it is encoded in enduring transmission mechanisms such as music, the visual arts, religion and myths, for example. And almost all of the stuff that is written down isn't ever going to be accessible through the public internet for very practical reasons. You can get some of this piped into your computer if you're lucky enough to belong to a local library, but that's because a consensual social mechanism has been invoked to bypass such restrictions. What the internet's public search engines are left to work with is a toxic wasteland largely characterized by the generation of real time noise - both private and commercial - and what the machines churn out in answer to our hopeful "queries" isn't of much use to the rest of us.
To technologists, the solution is obvious: it's going to require either a technical fix, or some huge change in social behavior, the creation of a world where we're all moored to our computers twenty four hours a day, so making society conform to the limitations of today's machines. But we all know this isn't going to happen. Fortunately, there are better ways out of this conundrum.
Just as governments have realized that using collective, centralized bargaining power against large pharmaceutical companies is a great way of reducing the cost of drugs used by the population, so one day, governments will realize that collective social bargaining with copyright and database owners can help bring good quality information to the population at large. This is a win-win agreement that makes the copyright holders richer beyond their wildest dreams, and gives us high quality databases that we could never before have accessed.
If the cult of "information" is as important as technophiliacs tell us it is, we need to develop social mechanisms, not fancier search engines, to get us to the Holy Land. Don't look to the privatized information scavengers of the web for answers.
So all the while we were consumed with the "search engine wars", what we were really looking at was the "library wars". And whoever has the best library wins, in this case. ®