TfL hackathon showed data can keep transport running and people safe
Analytics is about the journey AND destination
Sponsored
If software is eating the world, then hackathons are its fast-food restaurants. Groups of developers come together for short periods to try to solve pressing problems. This happens in sectors from healthcare to retail, and now it's happening in transportation too.
London, the UK's capital, is a city groaning under its own weight. Its road network supports roughly 21 million trips each day, accounting for around 80 per cent of all trips in the city, according to Transport for London, the local government body that manages its transport network. Buses carry 6.5 million people each day, picking their way around 500,000 roadwork jobs each year.
Things are getting more challenging over time. There will be around 10 million people in the city by 2030, up from around 8.6 million in 2015, travelling on road and tube systems built years ago to cope with far lower traffic volumes. In 25 years, the system will have to support another five million daily trips, warns Lauren Sager-Weinstein, TfL's chief data analyst.
TfL has turned to analytics to solve the problem and spends a lot of its time gathering data and turning it into useful insights.
Data, data everywhere
"For example, the 19 million daily transactions that we see through our ticketing system feed into models that our transport planners use to forecast future demand on the network," she says, adding that the introduction of Oyster in 2003 was a huge leap. Before that, it had been gathering passenger journey data using paper-based surveys.
"When we combine this ticketing data with other data sets, such as bus location data, we use this to plan our bus network – at the route level and even at bus stop location level."
TfL has also embraced the open data movement, publishing a variety of feeds via its own unified API, available to the public. Apparently, developers quite like it. Maybe a little too much.
Anyone can access live data on transportation and related information, ranging from live bus arrivals, through to train status on the tube, and live traffic disruptions. It produces much of this data using its connected Split Cycle Offset Optimisation Technique (SCOOT) system, which uses embedded road sensors to see how traffic is flowing on the street, and what emissions are like.
Alongside this live data, you can also get access to more static information in the form of structured data feeds, ranging from Oyster ticket stop locations through to Wi-Fi access points on the tube, and even walking times between adjacent stations. The latter might make stunts like this one far easier, incidentally.
How can TfL turn all that juicy data into something actionable that helps it solve problems such as managing road traffic, or keeping trains on the line and at capacity? Could it use data to maintain or improve air quality?
Hack to the future
Last September, the organisation hosted its first hackathon to try to solve some of these problems. Anyone could enter the contest, which TfL held in conjunction with Amazon Web Services and Geovation, a London-based innovation hub owned by Ordnance Survey that helps entrepreneurs take location-based technology ideas to market.
As part of the event, TfL organised "surgeries" with subject matter experts.
"It was a great opportunity to test new data, have a close dialogue with developers (we have since launched a tech forum) and for TfL to promote the data that is available," says Rikesh Shah, lead digital partnership manager for TfL.
There was a mixed bag of teams at the event, explains Alex Wrottesley, who heads up Geovation. "There were some very new, early stage hackers and then some more developed companies," he says.
One of these more established firms was WSO2, a large Sri Lankan middleware company that has seen multiple rounds of investment from Intel's VC arm, Intel Capital. WSO2 focuses on analytics and big data solutions as part of its integrated application platform for businesses.
WSO2 won because it pulled together a solid, functional prototype from scratch in a short time, says Wrottesley, which was one of the things that impressed him as a judge for the event.
WSO2's product used a map to explore the crowding level at different underground stations on the network, Wrottesley says, articulating its key benefit: "Looking at a map, and knowing in real time what the on-platform capacity and congestion would be for different stations, and how it would impact your transition through the system."
Mapping Londoners' journeys
The WSO2 team used the firm's own Complex Event Processor, an open source back-end analytics engine that identifies and prioritises real-time events based on underlying details such as latitude and longitude. It uses an SQL-like language to process queries.
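To give a feel for what an engine like this does, here is a minimal sketch in plain Python rather than in the CEP's own query language. It keeps a sliding time window of events that fall inside a latitude/longitude bounding box and reports how many are currently in the window. The `Event` fields, the class name, the coordinates and the 60-second window are all assumptions made up for illustration; none of them comes from WSO2's prototype.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Event:
    timestamp: float  # seconds since an arbitrary start (hypothetical field)
    lat: float
    lon: float
    station: str      # hypothetical label, not a real TfL identifier


class WindowedGeoCounter:
    """Counts events inside a bounding box over a sliding time window.

    Assumes events are pushed in non-decreasing timestamp order.
    """

    def __init__(self, south, west, north, east, window_seconds):
        self.south, self.west = south, west
        self.north, self.east = north, east
        self.window_seconds = window_seconds
        self._events = deque()

    def push(self, event):
        # Evict events that have fallen out of the time window.
        cutoff = event.timestamp - self.window_seconds
        while self._events and self._events[0].timestamp <= cutoff:
            self._events.popleft()
        # Keep the new event only if it lies inside the bounding box.
        inside_lat = self.south <= event.lat <= self.north
        inside_lon = self.west <= event.lon <= self.east
        if inside_lat and inside_lon:
            self._events.append(event)
        return len(self._events)


# Usage: a made-up box and a 60-second window.
counter = WindowedGeoCounter(51.50, -0.15, 51.52, -0.10, window_seconds=60)
for ev in [
    Event(0, 51.51, -0.12, "Station A"),
    Event(10, 51.51, -0.12, "Station A"),
    Event(20, 51.60, -0.12, "Station B"),  # outside the box, ignored
    Event(70, 51.51, -0.12, "Station A"),  # earlier events have expired
]:
    print(ev.timestamp, counter.push(ev))
```

A real event processor would express the filter and the window declaratively and handle out-of-order events; the sketch only shows the underlying idea of filtering a stream by location and aggregating over recent time.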
WSO2 poured a combination of data feeds into the CEP to solve this problem. First, it consumed historical data on passenger numbers to predict how many people would be at a station at a given time, and plotted this on Ordnance Survey's OS Maps.
The company analysed passenger data to understand the flow of people through various stations from entrance to platform and platform to exit. It could even predict passenger flow between platforms, it said in its description of the project.
It combined this with sensor data from TfL's SCOOT network to understand current traffic flows, and overlaid data about roadworks currently in play.
Then it analysed this data using random forest classification. This approach to machine learning uses hundreds of decision trees, each taking a random sample from the data set. It then collates the results from each tree, aggregating them to produce a predictive result.
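The idea of many trees, each trained on a random sample and then put to a vote, can be sketched in a few dozen lines of plain Python. This is a toy illustration and not WSO2's model: the two features (entries per minute and people on the platform), the "quiet" and "crowded" labels and every number in the data set are invented for the example, and the trees are kept deliberately shallow.

```python
import random
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def best_split(rows, labels, feature_ids):
    """Find the (feature, threshold) with the lowest weighted impurity."""
    best = None
    for f in feature_ids:
        for t in sorted({r[f] for r in rows}):
            left = [y for r, y in zip(rows, labels) if r[f] <= t]
            right = [y for r, y in zip(rows, labels) if r[f] > t]
            if not left or not right:
                continue
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
            if best is None or score < best[0]:
                best = (score, f, t)
    return best


def build_tree(rows, labels, rng, depth=0, max_depth=3):
    """Grow a small tree; a leaf is a label, a node is (feature, threshold, left, right)."""
    majority = Counter(labels).most_common(1)[0][0]
    if depth >= max_depth or len(set(labels)) == 1:
        return majority
    n_features = len(rows[0])
    k = max(1, int(n_features ** 0.5))  # random subset of features per split
    split = best_split(rows, labels, rng.sample(range(n_features), k))
    if split is None:
        return majority
    _, f, t = split
    left = [i for i, r in enumerate(rows) if r[f] <= t]
    right = [i for i, r in enumerate(rows) if r[f] > t]
    return (
        f, t,
        build_tree([rows[i] for i in left], [labels[i] for i in left], rng, depth + 1, max_depth),
        build_tree([rows[i] for i in right], [labels[i] for i in right], rng, depth + 1, max_depth),
    )


def predict_tree(tree, row):
    while isinstance(tree, tuple):
        f, t, left, right = tree
        tree = left if row[f] <= t else right
    return tree


def train_forest(rows, labels, n_trees=100, seed=0):
    rng = random.Random(seed)
    forest = []
    for _ in range(n_trees):
        # Bootstrap: sample the data set with replacement for each tree.
        idx = [rng.randrange(len(rows)) for _ in rows]
        forest.append(build_tree([rows[i] for i in idx], [labels[i] for i in idx], rng))
    return forest


def predict_forest(forest, row):
    # Aggregate by majority vote across all trees.
    return Counter(predict_tree(t, row) for t in forest).most_common(1)[0][0]


# Invented data: (entries per minute, people on platform) -> label.
ROWS = [(5, 20), (8, 35), (10, 30), (12, 40), (7, 25), (9, 45),
        (40, 150), (55, 180), (48, 160), (60, 200), (45, 170), (52, 190)]
LABELS = ["quiet"] * 6 + ["crowded"] * 6

forest = train_forest(ROWS, LABELS)
print(predict_forest(forest, (50, 175)))
print(predict_forest(forest, (3, 15)))
```

Production libraries add many refinements, such as probability estimates and feature importance scores, but the bootstrap-then-vote structure is the part the description above refers to.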
Random forest is a widely used machine learning technique, and machine learning played a big part in the WSO2 project and is having a wider impact on analytics, according to Senaka Fernando, director of solutions architecture at WSO2.
"Predictive analytics (aka machine learning) can make predictions helping organizations and the public to be prepared for the future," he says. "For example, in the case of TfL, it's very important to understand catastrophic incidents that have just happened."
Using analytics on the ground like this can help transport managers to take action quickly and help the public plan their way around everything from criminal incidents through to accidents and natural disasters.
In WSO2's case, the machine learning model predicted traffic five or ten minutes into the future, and got it right 88 per cent of the time, according to the company. That enabled it to recommend the best route to a destination for commuters.
The project also included air quality data across London, using a feed from King's College London accessed via the TfL unified API. WSO2 used that to recommend the best walking and cycling routes across Greater London.
Travelling by the numbers
Wrottesley sees other uses for analytics in transportation over time. One idea involves using data to help shape passenger behaviour and spread the load more evenly across the system.
"Most of our transport networks are massively underused most of the time," he points out. "Are there innovative pricing models? Are there different ways to use technology and communications to help people stagger their departure times and reduce peak time impact on transport systems?"
This is all good stuff, but the products that hackathons create don't always get used in their final form. Most of the also-ran entries in the Hack Week were listed as ideas or prototypes. There are some interesting concepts in there, though.
One of them proposed a reader on the side of buses to tell passengers how crowded they were, while another posited a cycle hire solution linked to a demand-driven market that would encourage particular journeys. Yet another used gamification in journeys to get people to travel along less congested routes, and maybe do a little more walking.
Shah liked the idea of voice-controlled travel updates, which was another idea floated at the event. "The conversational user interface arena is interesting so hearing devices like Amazon Alexa promoting real-time bus information was good," he says. "I would like to see more adoption of new hardware using our open data to find innovative ways to engage with our customers."
The WSO2 prototype didn't go forward in its hackathon form, says Wrottesley, but it did enable WSO2 to demonstrate its expertise to TfL.
Shah says that the process still delivered TfL a lot of value. "Because the station business data was a snapshot which we tested with developers, the feedback from the hackathon helped us to develop more enhanced data which we recently launched," he says. "We are hoping to re-engage with app developers to use this new data."
This isn't the first time that TfL has dabbled with data. A couple of years ago it had the temerity to meddle with Londoners' habit of standing on the right side of the escalator and leaving the left-hand side free for walkers. According to TfL beancounters, the numbers showed that by standing on both sides, 40 per cent more people could get up the escalator at the same time.
It tried this for three weeks, enforcing the policy with a mixture of pleas to passengers, and secretive plants who would stand on the left side. It worked, in spite of loud protests from angry Tube riders. And as soon as the trial stopped? Everyone went back to standing on the right again. Which just goes to show – analytics is great from 50,000 feet, but there are some numbers that Londoners will always find difficult to swallow.