MS Research: in high gear on the road to nowhere?

But it might just be in some danger of finally producing something

Accompanying the announcement that Rick Rashid was officially taking over Microsoft's Research Group was an announcement that its natural language processing (NLP) work "hits high gear". But if this is true, it has been a long, slow acceleration. In nearly ten years of trying to develop applications in the speech and translation area, Microsoft Research hasn't managed to produce one significant released product. There was of course the grammar checker in MS Office, which must rank among Microsoft's big embarrassments, and some trivial work on Encarta queries. And for Microsoft to claim credit for natural language queries in SQL7 is enough to make steam come out of the ears of Oracle and IBM database developers. Microsoft even admits that "as our system matures, it should greatly increase the accuracy of these products". So there we have it: the present products are inadequate.
Outgoing Research boss Nathan Myhrvold did once announce that he had sent some speech recognition work to the MS Office team, but it never saw the light of day. For years Microsoft's refrain has been that it would deliver speech recognition once accuracy reached a certain percentage, whatever figure happened to be unachievable at the time. Microsoft has partnered with Lernout & Hauspie to get a speech recognition add-on for MS Office, and has even taken a 7 per cent share in the company and put a watchdog on its board. But wisely, L&H has kept its independence, acquired Dragon in the interim (reckoned by many users to have the best speech recognition package at the moment), and demonstrated capability approaching what President Clinton was talking about in his State of the Union Address in January: "devices that can translate languages as fast as we can speak". Meanwhile, the best devices around are people called interpreters. Microsoft has been asking itself "Why is it taking so long?", and it's the right question.
Microsoft has evidently shifted its focus from a speech recognition engine, where there is competition not just from L&H but also from IBM and Philips, to an attempt to understand human language in its NLP group, which has 30 people and is managed by Karen Jensen. She was poached from IBM in 1991, along with Stephen Richardson, and has not been credited with much in the public domain since her 1993 book, which is mostly about what she learned in her eleven years at the IBM Watson Research Center. Microsoft is working with just seven languages: Chinese, English, French, German, Japanese, Korean and Spanish. Microsoft is not shy about mentioning just how much it spends on research and development, but the truth is of course that most is spent on development and only a rather small proportion on pure research. There's a long way to go before anything useful is likely to emerge from the NLP group.
