Microsoft has thrown open the doors to its AI Lab, a suite of beginner projects to help developers learn machine learning.

There are five experiments covering computer vision, natural language processing, and drones. “Each lab gives you access to the experimentation playground, source code on GitHub, a crisp developer-friendly video, and insights into the underlying business problem and solution,” Microsoftie Tara Shankar Jana said on Tuesday.

The first one is the DrawingBot. It teaches developers about generative adversarial networks (GANs), a popular type of neural network that learns to create content similar to the data it was trained on. DrawingBot uses AttnGAN - short for attentional GAN - created by a group of researchers from Microsoft, Lehigh University, Rutgers University and Duke University.


AttnGAN generates an output image based on input text. DrawingBot focuses specifically on birds: it inspects the individual words in the input and learns to map each part of the description to a specific region of the picture of a bird.

For example, it will learn to pick up things like ‘red wings’ or ‘orange beak’ to build up the image. If the description strays too far from what it was trained on, then AttnGAN can’t quite match it. So, it’s pretty good at generating realistic-looking images of birds, but not ones with, say, two heads or gigantic dragon-like wingspans.
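To give a flavour of the word-to-region idea, here is a toy sketch in Python. It is not AttnGAN's code: the two-dimensional embeddings, the `word_attention` helper and the "wing region" vector are all invented for illustration, and a real model learns these values during training. It only shows the general attention trick of scoring each word against an image region and turning the scores into weights.

```python
import math


def softmax(scores):
    """Turn raw scores into weights that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def word_attention(region_vec, word_vecs):
    """Weight each word by its dot-product similarity to one image region."""
    scores = [sum(r * w for r, w in zip(region_vec, vec)) for vec in word_vecs]
    return softmax(scores)


# Hypothetical embeddings, made up for this example only.
words = ["red", "wings", "orange", "beak"]
word_vecs = [[2.0, 0.0], [1.5, 0.2], [0.0, 2.0], [0.1, 1.8]]
wing_region = [1.0, 0.0]  # a region feature that resembles the "wing" words

weights = word_attention(wing_region, word_vecs)
top_word = words[weights.index(max(weights))]
print(top_word, [round(w, 2) for w in weights])
```

In this made-up setup the wing region leans most heavily on the words that describe wings, which is the behaviour the attention mechanism is meant to encourage.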

JFK Files is less flashy. It uses Microsoft’s Cognitive Search and Azure Search together and applies natural language processing to search through documents for relevant information.

In the AI Lab example, Microsoft uses the now-public declassified files on the JFK investigation to demonstrate how it all works. It can pull things like names and dates out of the old documents, some of which are handwritten.
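As a rough picture of what "pulling out names and dates" means, here is a deliberately naive Python sketch. It swaps in plain regular expressions for Microsoft's services, so it is a stand-in for the idea rather than anything the lab actually runs. The patterns, the `extract_entities` helper and the sample sentence are all invented for this example.

```python
import re

MONTHS = (r"(?:January|February|March|April|May|June|July|August|"
          r"September|October|November|December)")

# Matches dates written like "22 November 1963" or "November 22, 1963".
DATE_RE = re.compile(
    rf"\b(?:\d{{1,2}} {MONTHS} \d{{4}}|{MONTHS} \d{{1,2}}, \d{{4}})\b"
)

# Naive name pattern: two or more capitalised words in a row.
NAME_RE = re.compile(r"\b[A-Z][a-z]+(?: [A-Z][a-z]+)+\b")


def extract_entities(text):
    """Return the dates and candidate names found in a piece of text."""
    return {
        "dates": DATE_RE.findall(text),
        "names": NAME_RE.findall(text),
    }


# Sample sentence written for this example, not taken from the real files.
sample = "Memo dated 22 November 1963: agents interviewed Lee Harvey Oswald in Dallas."
entities = extract_entities(sample)
print(entities)
```

Rules like these fall over quickly on messy scans and handwriting, which is why a trained language model is the more realistic tool for the job.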

The third one on the list is style transfer. It uses the popular COCO dataset with over 200,000 labelled images. The objects in the photos are segmented and carried over when the style transfer is applied. It turns realistic photographs into mosaic-style or surrealist-style images.

It uses Microsoft’s Visual Studio Tools to train the model, and developers can play around with the code, written in Keras and TensorFlow, and execute it on Azure Cloud using Nvidia GPUs.

The fourth is about text understanding. Machine Reading Comprehension uses the popular SQuAD dataset, which provides short paragraphs for a model to read and then answer questions about. Microsoft uses its Reasoning Network (ReasoNet) to do this, and hopes it can be applied to enterprise data and to customer service, answering any questions a customer might have about a specific application - a bit like FAQs.

Finally, there is the simulated drone environment AirSim. The goal is to pilot a drone around a soccer field that is littered with stuffed animals. Users have to write a program in Python to fly the drone around and identify all the animals.
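One plausible starting point for the flying half of that task is a back-and-forth sweep of the pitch. The Python sketch below only computes waypoints and does not talk to AirSim at all. The `sweep_waypoints` helper, the 100 by 60 metre field, the 20 metre lane spacing and the altitude are assumptions picked for illustration, and the negative altitude assumes a coordinate convention in which z points downwards.

```python
def sweep_waypoints(width, length, lane_spacing, altitude):
    """Return (x, y, z) points for a back-and-forth sweep of a rectangle."""
    if lane_spacing <= 0:
        raise ValueError("lane_spacing must be positive")
    waypoints = []
    y = 0.0
    heading_out = True  # alternate direction on each lane
    while y <= length:
        start_x, end_x = (0.0, width) if heading_out else (width, 0.0)
        waypoints.append((start_x, y, altitude))
        waypoints.append((end_x, y, altitude))
        heading_out = not heading_out
        y += lane_spacing
    return waypoints


# Hypothetical field dimensions in metres; z is negative because "up" is
# assumed to be the negative direction in this sketch.
path = sweep_waypoints(width=100.0, length=60.0, lane_spacing=20.0, altitude=-10.0)
for point in path:
    print(point)
```

Each point could then be handed to whatever move-to-position call the simulator's Python client offers, with the animal-spotting model run on the camera frames captured along the way. That wiring is left out here.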

The code can be run on a real drone. It needs to be exported to TensorFlow and then packaged into Docker containers, before it’s deployed using Microsoft’s Azure IoT Edge platform to run on a drone using an Nvidia GPU.

You can play with them all here. ®

