Identifying planets with machine learning, dirty AI searches, and OpenAI scholarships
Oh, and an amusing story about an AI medical chatbot
Roundup Hello, here’s this week’s AI roundup. There is new code to play around with for those interested in machine learning and space, a model that predicts hilarious search trends for sex site YouPorn, and another funny story about an ostensibly intelligent medical chatbot in New Zealand.
Hunting exoplanets with ML – The machine learning code that a Google engineer and an astrophysicist used to detect exoplanets has been published online.
Christopher Shallue, a senior software engineer at Google, and Andrew Vanderburg, a postdoctoral fellow studying astrophysics at the University of Texas, USA, discovered another planet lurking in the Kepler-90 system.
It was a special find. Not only was it spotted using a convolutional neural network, but it also meant that the Solar System was no longer the sole record holder for the largest known planetary system: Kepler-90 is now believed to have eight planets, the same number as our Solar System.
Shallue explained in a blog post how the model uses data taken from NASA’s Kepler telescope to decide whether an object is an exoplanet or not. It uses the transit method: as a planet passes across its star during its orbit, it blocks out some light and the star’s brightness decreases.
A graph measuring the star’s brightness will have a dip signifying when the planet is in transit. The code uses a “Box Least Squares” algorithm, which examines the “U-shaped” dip typical of planets and the “V-shaped” dip typical of binary stars.
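To make the idea of a box-shaped dip concrete, here is a toy sketch of a box least squares search in plain Python. It is not the published TensorFlow code: the function names, the synthetic light curve and the search grid are all assumptions made for illustration, and it makes no attempt to tell U-shaped dips from V-shaped ones.

```python
import random


def make_light_curve(n=400, period=50, duration=4, depth=0.01, noise=0.001, seed=0):
    """Synthetic brightness series: flat at 1.0 with a box-shaped dip once per period."""
    rng = random.Random(seed)
    return [
        1.0 - (depth if t % period < duration else 0.0) + rng.gauss(0.0, noise)
        for t in range(n)
    ]


def best_box(flux, periods, durations):
    """Brute-force search for the periodic box-shaped dip that best fits the data.

    For each trial (period, duration, start) the samples are split into
    "inside the box" and "outside the box". The score is how much a two-level
    model reduces the squared error compared with a flat line.
    """
    n = len(flux)
    mean = sum(flux) / n
    resid = [f - mean for f in flux]
    best = None
    for period in periods:
        for duration in durations:
            for start in range(period):
                inside = [r for t, r in enumerate(resid) if (t - start) % period < duration]
                k = len(inside)
                if k == 0 or k == n:
                    continue
                s = sum(inside)
                if s >= 0:
                    continue  # only dips count, not brightenings
                scale = n / (k * (n - k))
                score = s * s * scale
                if best is None or score > best["score"]:
                    best = {
                        "period": period,
                        "duration": duration,
                        "start": start,
                        "depth": -s * scale,  # out-of-box level minus in-box level
                        "score": score,
                    }
    return best


flux = make_light_curve()
result = best_box(flux, periods=range(30, 71), durations=(2, 4, 6))
print(result)
```

On this synthetic data the search should recover the period, duration and approximate depth that were planted in the light curve. Real Kepler data is far noisier and unevenly sampled, so treat this as a picture of the principle only.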
Shallue said: “Our work here is far from done. We’ve only searched 670 stars out of 200,000 observed by Kepler – who knows what we might find when we turn our technique to the entire dataset.”
He admitted that the model is “not yet as good at rejecting binary stars and instrumental false positives as some more mature computer heuristics.” But the hope is that, now the model is public, other developers can help improve it.
If you’re a TensorFlow whiz, you can have a go here.
Interpreting models – A team of researchers has published an article in Distill, a journal known for its deep dives on a particular topic in machine learning, describing techniques that make it easier for developers to visualize how a neural network arrives at a decision.
They study how the neurons in the hidden layers of GoogLeNet, an image classifier, work to detect objects in images by developing a “semantic dictionary.” It pairs every neuron activation with a visualization of that neuron, shows where in the image each neuron fires, and sorts the neurons by the magnitude of their activation. Using the example of a labrador sitting next to a tiger cat, they find that GoogLeNet focuses on the droopy ears of the dog to decide the object is a labrador and the pointy ears of the cat to classify it as a tiger cat.
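The sorting step can be pictured with a few lines of Python. This is only a sketch of the ranking idea: the function name, the activation values and the text labels are made up for illustration, and the Distill article pairs neurons with generated images rather than words.

```python
def top_activations(activations, labels, k=3):
    """Pair each neuron's activation with its label and return the k strongest."""
    ranked = sorted(zip(labels, activations), key=lambda pair: pair[1], reverse=True)
    return ranked[:k]


# Hypothetical activations for four neurons at one position in the image.
labels = ["grass", "floppy ear", "whiskers", "snout"]
activations = [0.1, 2.5, 0.7, 1.9]

strongest = top_activations(activations, labels, k=2)
print(strongest)
```

Repeating this at every position in the image gives a map of which features the network responds to most strongly, and where.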
The article is long, technical and detailed. Make sure to hover your mouse and drag buttons over the results to see all the cool visual explanations.
OpenAI Scholars – OpenAI is supporting people from underrepresented groups with a stipend and mentorship to study and work on a deep learning project.
It’s open to students studying the subject full time during the three-month stipend period, from June 4, 2018 to August 31, 2018, who have a US work permit and are working in a US timezone.
AI and computer science in general are notorious for their lack of diversity. In a blog post, OpenAI explained that “diversity is core to AI having a positive effect on the world — it’s necessary to ensure the advanced AI systems in the future are built to benefit everyone.”
OpenAI will provide:
- A $7.5k/mo stipend for 3 months from June 4, 2018 to August 31, 2018.
- Each scholar will receive $25,000 worth of credits from Amazon Web Services.
- You’ll have a mentor who will provide at least an hour of mentorship via video call each week, answer your questions via chat/email, and work with you to design and execute a good project to stretch your skills.
- There will be a group Slack with the scholars and mentors. If you’re in the Bay Area, we’ll optionally provide a desk for you at the OpenAI office.
“While we hope that some of the scholars will join OpenAI (where we are actively working on internal diversity & inclusion initiatives), we want this program to improve diversity in the field at large.”
Last year, Rachel Thomas, co-founder of fast.ai and a professor of data science at the University of San Francisco, wrote about the diversity crisis in AI.
In her post, she said that OpenAI does not disclose diversity stats, but she believes the company is probably less diverse than other AI research hubs like Google Brain.
The applications are open now and will close at 11:59 PT on March 31st. You can apply here.
NSFW: Hardbore Yore, Girl Time Flanty, and wow – YouPorn, a free porn video site, trained a recurrent neural network to predict which search terms would be most popular among users looking through its skin flicks.
The results are great. Not great because they’re necessarily accurate, but great because they’re funny. YouPorn watchers are, apparently, most interested in “T’Challa & Shuri”, “asarian humlion” and “girl time flanty”. The first term names two characters from the movie Black Panther. But what the hell is “girl time flanty”?!
We asked YouPorn a series of questions. A spokesperson ignored the technical ones, so we don’t know the specifics of the model or what data was used to train it. But the spokesperson did say the company was “interested to see, for fun, what [its] trained recurrent neural network would predict with respect to search terms”.
Here are a few of our favourites from the list: "german mom hour", "cock milking table", "big booble hoter french", "doot sex", "batish my yisel", "beaf buts compilation", "blow yo" and simply "wow".
You can see the full list here. (Don’t worry, the link is SFW.)
More Google image challenges – Google has launched another image recognition challenge, this time focused on nature.
The 2018 iNaturalist Challenge is a species classification competition. Google is working with iNaturalist, an online community that shares pictures and videos of various types of plants, animals and insects to monitor changes, and Visipedia, a project that combines computer science and crowdsourcing, to identify various species of plants, animals, and fungi.
It’s a flagship challenge for developers attending the Conference on Computer Vision and Pattern Recognition (CVPR) this June in Salt Lake City, Utah. The competition extends the previous iNat-2017 challenge, and contains over 450,000 training images sorted into more than 8,000 categories of living things.
The challenge is trickier than the ImageNet challenge, which is more general, because there are relatively few images for some species – a problem called “long-tailed distribution”.
Yang Song, a software engineer at Google Research, and Serge Belongie, a visiting professor of computer science from Cornell University, said: “It is important to enable machine learning models to handle categories in the long-tail, as the natural world is heavily imbalanced – some species are more abundant and easier to photograph than others. The iNaturalist challenge will encourage progress because the training distribution of iNat-2018 has an even longer tail than iNat-2017.”
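For a feel of what a long tail looks like in a training set, here is a small Python sketch that summarizes how unevenly images are spread across categories. The function name, the threshold and the toy labels are assumptions for illustration and have nothing to do with the actual iNat-2018 data.

```python
from collections import Counter


def tail_summary(labels, rare_threshold=5):
    """Summarize how unevenly examples are spread across categories."""
    counts = Counter(labels)
    sizes = sorted(counts.values(), reverse=True)
    rare = sum(1 for size in sizes if size < rare_threshold)
    return {
        "categories": len(sizes),
        "largest": sizes[0],
        "smallest": sizes[-1],
        "imbalance_ratio": sizes[0] / sizes[-1],
        "rare_fraction": rare / len(sizes),
    }


# Toy dataset: one common species and a tail of rarely photographed ones.
toy_labels = ["mallard"] * 40 + ["oak"] * 12 + ["newt"] * 3 + ["morel"] * 1
summary = tail_summary(toy_labels)
print(summary)
```

A high imbalance ratio and a large share of rare categories are the signs of the problem the researchers describe: a classifier sees plenty of mallards but almost no morels.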
The training data and pretrained image recognition models can be found here.
Behold, the sentient powers of Zach! – A Reg reader in New Zealand alerted us to an AI success story that was just too good to be true.
It involves Zach, an intelligent bot that helps doctors suggest the right medications and care needed for their patients. The technology is fronted by one mysterious Albi – short for Albicus – Whale.
Zach just needs a recording of what’s been said at the doctor’s appointment to work his magic. After that, you can simply ask Zach questions and voilà, it will answer you.
Sometimes, apparently, there are a few spelling mistakes but other than that Zach is pretty good. Oh, and you have to wait 20 minutes or so for a reply... you know, because it’s working so hard. And this whole interaction is, of course, all done via email.
We don’t want to give too much away, so read it and have a good chuckle here. ®