Microsoft's Cognitive Toolkit on GitHub in all its speech-recognising glory
Now go forth – and develop armies of soulless stenographers
Microsoft has released a catalogue of AI software on GitHub today under the Microsoft Cognitive Toolkit name.
The new toolkit is an updated version of the Computational Network Toolkit, which was developed by a team of computer scientists interested in speech recognition and natural language processing.
It has since expanded into other areas. The 22 APIs cover computer vision, emotion recognition, web search and text analysis, and have been updated to be compatible with C++ and Python.
Microsoft's AI researchers have already used the toolkit to build an automated system capable of recognising recorded speech from the NIST 2000 Switchboard test set with a word error rate of 5.9 per cent.
Heralded as a "major breakthrough" in speech recognition, the system performs slightly better than the level expected of a professional transcriptionist. Microsoft is keen to integrate the system into its AI assistant Cortana.
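For readers unfamiliar with the metric, word error rate is conventionally the number of word substitutions, deletions and insertions needed to turn the machine's transcript into the reference transcript, divided by the number of words in the reference. The Python sketch below illustrates that textbook definition only. It is not Microsoft's or NIST's scoring pipeline, which the article does not describe, and the function name `word_error_rate` and the sample sentences are made up for illustration.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the reference length.

    Illustrative only: real scoring tools also normalise text
    (case, punctuation, contractions) before comparing, which this
    sketch does not attempt.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far (excluding the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # i deletions turn i reference words into nothing
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1      # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr

    return prev[-1] / len(ref)


# One wrong word out of six in the reference.
print(f"{word_error_rate('the cat sat on the mat', 'the cat sat on a mat'):.1%}")
```

On this definition, a 5.9 per cent error rate corresponds to roughly six word-level mistakes for every 100 words in the reference transcript.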
The sudden surge of capabilities in AI has actually been brewing away for over 20 years, Xuedong Huang, Microsoft’s Chief Speech Scientist and a developer of the Microsoft Cognitive Toolkit (MCT), told The Register.
A combination of large datasets, better computer infrastructure and deep learning – something Huang calls "the three pillars" of AI – has led to sudden and significant advances in the field.
In the past it could take up to two months to train a speech recognition model on a single GPU, but with the toolkit it takes only days. Huang attributes Microsoft's "breakthroughs" in speech recognition to these faster training times.
In September, Huang's team announced it had achieved the lowest error rate in computer speech recognition, at 6.3 per cent. But a month later, the error rate has fallen to 5.9 per cent and the system has reached "human parity".
More than 20 years ago, the error rate was higher than 60 per cent. Now that AI is rapidly improving, it's important to "democratise" the technology, Huang told The Register.
He hopes that developers will seize the opportunity to use the toolkit for research or to create new products with novel applications. ®