AI + ML

This article is more than 1 year old

Meet Clippy 9000: Microsoft brags about building Earth's largest AI language model, refuses to let it out of the lab

El Reg has also invented the warp drive but, ah jeez, we're told we can't show anyone. Shame. It's a tough one

Tue 11 Feb 2020 // 00:45 UTC

There’s a new giant AI language model in town: enter Microsoft’s Turing-NLG system, which apparently contains a whopping 17 billion parameters, making it the largest publicly known model of its class yet.

Turing-NLG joins the growing list of massive text-generating machine-learning systems. Google’s BERT model contains 340 million parameters, OpenAI’s GPT-2 tops out 1.5 billion parameters, and Nvidia’s Megatron-LM packs 8.3 billion parameters.

All of them are smaller than Turing-NLG, Redmond claimed on Monday. Like its rivals, you give Turing-NLG a writing prompt, and it uses this to generate what it predicts are human-like follow-on sentences.

“Microsoft is introducing Turing Natural Language Generation (T-NLG), the largest model ever published at 17 billion parameters, which outperforms the state of the art on a variety of language modeling benchmarks and also excels when applied to numerous practical tasks, including summarization and question answering,” Microsoft claimed.

Like its predecessors, the 17-billion parameter model is built out of transformers, an AI architecture that processes incoming text and output words in a parallel fashion that takes context into account. If the previous sentence or so was about the country France, this context is fed forward through the processing chain to that subsequent sentences are related to the Euro nation.

Meaningful or human-like text is tricky for machines to generate because there has to be an appreciation of context: sentences that flit wildly between subjects come across as nonsensical, so there has to be some kind of train of thought, no matter how artificial or vacuous it is.

Google says its latest chatbot is the most human-like ever – trained on our species' best works: 341GB of social media

Previous serial-like language models processed data in order, like an assembly line, and had to work hard to maintain a coherent theme in the output. Transformer-based models integrate context more directly in their training, and thus produce relatively high-quality prose, generally speaking.

Microsoft has been coy with regard to the technical details of Turing-NLG. When asked if researchers were planning to publish a paper outlining T-NLG, a spokesperson told The Register that "no additional papers [beyond the announcements] are anticipated at this time."

Researchers used “a Nvidia DGX-2 hardware setup,” and split the model across four V100 GPUs. A total of 256 V100s were required to train the hefty model on 174GB of internet-scraped text, we're told. For comparison, Nvidia’s Megatron-LM model spun up 512 V100 GPUs to churn its way through the same amount of training data.

T-NLG has allegedly achieved top scores for common benchmarks that test the ability for language models to generate coherent and complex sentences, answer questions, summarize text, apparently. It’s difficult to verify the numbers since Microsoft hasn't outlined the methodology used to test its system.

It has also withheld the model, too. A “private demo” has been given to a small number of academics for testing and feedback purposes. It’s possible that Redmond won’t release the model at all, but it hinted that the state of the art results provided “new opportunities for Microsoft and our customers. "We don’t have any details to share about future plans,” a spokesperson told us.

T-NLG can be used to improve Microsoft Office to read and summarize text documents, or in Outlook to assist users in writing emails, or maybe even in Cortana, Microsoft’s voice assistant chatbot. Who knows, maybe even Clippy, Microsoft's beloved virtual talking paper clip might make a comeback with AI superpowers or something. ®

Topics

Special Features

Vendor Voice

Resources

AI + ML

Meet Clippy 9000: Microsoft brags about building Earth's largest AI language model, refuses to let it out of the lab

El Reg has also invented the warp drive but, ah jeez, we're told we can't show anyone. Shame. It's a tough one

Google says its latest chatbot is the most human-like ever – trained on our species' best works: 341GB of social media

More about

More about

Narrower topics

Broader topics

More about

More about

More about

Narrower topics

Broader topics

TIP US OFF

Other stories you might like

Microsoft puts ex-DeepMind boffin in charge of London AI hub

Microsoft teases deepfake AI that's too powerful to release

AI gold rush continues as Microsoft invests $1.5B in UAE's G42

Industrial systems integrating digitalisation

Microsoft aims to triple datacenter capacity to fuel AI boom

Microsoft foresees a new type of AI PC: A Surface designed with help from machines

Why Microsoft's Copilot will only kinda run locally on AI PCs for now

Microsoft warns that China is using AI to stir the pot ahead of US election

Microsoft, OpenAI may be dreaming of $100B 5GW AI 'Stargate' supercomputer

Microsoft rolls out safety tools for Azure AI. Hint: More models

US, Japan announce joint AI research projects funded by Nvidia, Microsoft, others

AI spam is winning the battle against search engine quality

About Us

Our Websites

Your Privacy