Is that a phone in your hand – or a gun? This neural network reckons it has it all figured out
Real-time video search for weapons looking possible
Artificial intelligence has the potential to take over mundane, boring tasks such as driving, scheduling meetings and transcribing speech. Now there's another job that can be added to the list: detecting handguns in videos.
As the technology improves, it won't be long before police officers or security guards can jump straight to the scene they're interested in without having to rewind hours of footage to find the relevant part.
Researchers from the University of Granada in Spain have written a paper on training a convolutional neural network (CNN) to automatically detect firearms.
Surveillance companies have been interested in these sorts of capabilities for a while. Taser International, a major provider of police body cameras, acquired two computer vision companies earlier this year: Dextro and Fossil Group. These gobbled-up businesses are expected to help Taser produce software that can automatically search hours of video footage for particular things – presumably faces of known criminals, weapons, and that kind of thing.
Dextro claimed to be the first deep-learning biz to analyze live video in real time. Fossil Group focused on image and video processing. Both have been merged to create Axon AI – Taser's new AI arm.
It isn't clear how Axon AI will analyze huge swathes of data, but if it's planning to help cops look for weapons in body cam footage, it may be using an approach similar to that highlighted in the University of Granada's paper.
The research has the potential to be a nifty, highly sought-after bit of technology. Siham Tabik, lead author of the paper and assistant professor at the department of engineering at the University of Granada, told The Register the team recently had a proposition from a company and are arranging to "meet and discuss it with them soon."
Freeze! Put your hands in the air
VGGNet, a large 16-layer convolutional network with a whopping 144 million parameters, is trained as a classifier to recognize the common features of guns, such as their shape or color.
Only common types of handguns like revolvers, automatic and semi-automatic pistols, six-gun shooters, horse pistols and derringers were considered.
The best method uses the "region proposals approach," where researchers label the guns with a "bounding box" to highlight the location of the guns in each of the 3,000 images used for training. VGGNet can then search for a specific area to target its classifier algorithms without wasting computational power on any background pixels that aren't relevant to the gun.
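The idea of classifying only candidate regions, rather than every pixel, can be shown with a toy sketch. Everything in it is an assumption made for illustration: the `Box` type, the `crop` and `detect` helpers, the stand-in `toy_classifier` and the 0.5 threshold are hypothetical, and a real region-proposal detector learns where to look rather than being handed a list of boxes. It is not the researchers' code.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple

Image = Sequence[Sequence[int]]  # rows of pixel values, a stand-in for a real image


@dataclass(frozen=True)
class Box:
    """A bounding box: top-left corner plus size, in pixels."""
    x: int
    y: int
    width: int
    height: int


def crop(image: Image, box: Box) -> List[List[int]]:
    """Cut the pixels inside the box out of the image."""
    rows = image[box.y:box.y + box.height]
    return [list(row[box.x:box.x + box.width]) for row in rows]


def detect(
    image: Image,
    proposals: Sequence[Box],
    classify: Callable[[List[List[int]]], float],
    threshold: float = 0.5,  # hypothetical cut-off, not taken from the paper
) -> List[Tuple[Box, float]]:
    """Score only the proposed regions and keep those at or above the threshold."""
    hits = []
    for box in proposals:
        score = classify(crop(image, box))
        if score >= threshold:
            hits.append((box, score))
    return hits


def toy_classifier(region: List[List[int]]) -> float:
    """Stand-in for a trained network: the fraction of non-zero pixels."""
    pixels = [p for row in region for p in row]
    return sum(1 for p in pixels if p) / len(pixels) if pixels else 0.0


# A 4 x 4 "image" with a bright 2 x 2 object in the top-left corner.
image = [
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
]
proposals = [Box(0, 0, 2, 2), Box(2, 2, 2, 2)]
hits = detect(image, proposals, toy_classifier)
```

Here only the first box survives, because the second covers nothing but background. The saving comes from never running the classifier on regions nobody proposed.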
The whole detection process for images of 1000 x 1000 pixels took only approximately 0.19 seconds – good enough to detect pistols in near real time, an important feature if it is to handle videos.
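To put the 0.19-second figure in video terms, a quick back-of-the-envelope sketch helps. The 30 frames-per-second playback rate is an assumption for illustration, not a number from the paper, and the function names are made up here.

```python
import math


def detector_fps(seconds_per_image: float) -> float:
    """Images the detector can process per second."""
    if seconds_per_image <= 0:
        raise ValueError("seconds_per_image must be positive")
    return 1.0 / seconds_per_image


def frame_stride(video_fps: float, seconds_per_image: float) -> int:
    """Analyze every n-th frame so the detector keeps pace with playback."""
    return max(1, math.ceil(video_fps * seconds_per_image))


rate = detector_fps(0.19)          # 0.19 seconds is the figure quoted above
stride = frame_stride(30.0, 0.19)  # 30 fps is an assumed playback rate
print(f"{rate:.2f} images per second, analyze every {stride}th frame")
```

At roughly five images a second, a detector running at this speed would have to sample about one frame in six of a 30fps feed to keep up, which is why "near real time" is the right description.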
To increase the accuracy, the researchers wanted to minimize the detection of false positives – objects that at first glance appear to be guns but aren't. To do this, researchers fed the neural network thousands of images from different datasets to train the model to differentiate between guns and other objects such as mobile phones or pens.
Researchers then tested their model on seven selected YouTube videos. Most clips are scenes from popular films and TV, such as The World is Not Enough, Pulp Fiction, Mission: Impossible – Rogue Nation and Mr Bean.
An example of accurate detection of four pistols in a video still
If the gun is highlighted by a box that reads over 50 per cent, it's considered a true positive – a gun was correctly identified. The detector provides very high precision for six out of the seven videos (above 60 per cent), and the number of false positives is low.
It's quite precise, with a reported precision of 84.21 per cent.
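Precision is the share of flagged objects that really are guns; it says nothing about the guns the detector misses, which is what recall measures. A minimal sketch of both follows. The counts are illustrative only, picked so the arithmetic lands on 84.21 per cent, and are not taken from the paper.

```python
def precision(true_positives: int, false_positives: int) -> float:
    """Share of detections that were real guns."""
    flagged = true_positives + false_positives
    if flagged == 0:
        raise ValueError("no detections to evaluate")
    return true_positives / flagged


def recall(true_positives: int, false_negatives: int) -> float:
    """Share of real guns that were detected."""
    actual = true_positives + false_negatives
    if actual == 0:
        raise ValueError("no guns present to evaluate")
    return true_positives / actual


# Illustrative counts, NOT from the paper: 16 correct detections,
# 3 false alarms and 2 missed guns.
tp, fp, fn = 16, 3, 2
print(f"precision: {precision(tp, fp):.2%}")  # 16 / 19
print(f"recall:    {recall(tp, fn):.2%}")     # 16 / 18
```

A detector can score well on one measure and poorly on the other, so a high precision figure alone does not mean every gun in the frame gets spotted.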
A closer inspection of the test videos shows the model is not quite sensitive enough to be used in real-life situations yet. It struggles when there is low contrast and brightness in the videos, or if the gun is moved very quickly, or if it is largely obscured by people's hands.
Image taken from Pulp Fiction. The model fails to detect two guns in the background (false negative)
Although the video clips are of low quality, the good performance suggests the method could be used for automatic pistol detection alarm systems, the paper said.
The researchers are looking to enhance the precision and robustness of their model by including frames where the guns are in motion to cope with rapid movement, and are also looking to expand to a wider range of guns.
In the future, police officers may be able to skip to the scene of the crime quickly without having to trawl through hours of irrelevant CCTV footage. The technology may even keep people safer by preventing violence through early detection of weapons. ®