
World Cup stats fever - have you got the balls to win?

How to model your way to a result

No host nation has ever lost the opening game of the World Cup.

Now combine that nugget with the knowledge that South Africa is, by some distance, the lowest-ranked football team ever to have hosted the competition – the squad is 83rd in FIFA's rankings.

What do you think is going to happen when they face up to Mexico (ranked 17th) at 3pm (GMT+1) this afternoon in Johannesburg?

If you knew for sure, you'd keep it to yourself and put some money on it at the bookies, preferably with a healthy wedge on the correct score market. The starting line-up of each team will be key, as will other factors. The quest to forecast results of this kind is part of man's long-held desire, and ongoing research project, to use history to quantify risk and basically predict the future. [For a look at some individual players, go here]

Out of the 32 teams in the tournament, betting-wise South Africa is 18th favourite (using the market of the betting exchange Betfair). There is only one team with a worse FIFA ranking in these finals, and that is North Korea, ranked 105th by FIFA and in last place by the punters.

So of the 14 teams the punters rate below South Africa, 13 have historical performances that would suggest they are better (including Greece, Australia and Nigeria, FIFA ranked 13th, 20th and 21st respectively), yet all are less fancied to hold the World Cup trophy aloft by people who are putting their money on the result. That's a big home-turf boost (but let's face it, compared with favourites Spain and Brazil, none of these bottom-half teams is fancied much at all).

Because there's money to be made – in terms of gambling, but also in team management – there are businesses taking a crack at football forecasting, interpreting data, refining models and analysing which information is most relevant to act on.

Bettorlogic is a leading player in this market, and provides data to 350 bookmakers, football clubs, and betting syndicates. It employs statisticians and developers to work on its models. Even Arsenal manager Arsene Wenger uses its player analysis tool to examine the impact of a player, or combination of players, on team performance (see part two to view how key World Cup players have performed for their clubs).

“He [Wenger] is looking at how individual players perform in different scenarios, against different kinds of teams, in different phases of the game, and whether Arsenal are winning, drawing, losing at that point in the game,” says CEO Mike Falconer.

There's no end to the number of data points you can grab in any football game – half-time and full-time score, assists, shots on target, completed passes, team possession, ground covered by a player – and certain fans love this level of number crunching, as they do in cricket and baseball.

Bettorlogic's experience is that it works better if you cut through all this. At present you'd have to grab all this data manually, which would be impossible at scale and speed. The company's model is about goals, the time they're scored in a game and the rank of the respective teams – eg top five, bottom third etc. Player analysis compares the average number of points per game (PPG) a team gains when a player features for his club with its PPG when he does not play.

This analysis revealed Chelsea's Florent Malouda to be their key player in the 2008-09 season, when many considered him a weak link. His contribution was recognised the following season when he was named the Premier League's Player of the Month in March 2010.
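The with-and-without PPG comparison described above can be sketched in a few lines of Python. This is an illustration only, not Bettorlogic's actual tool: the `Match` record, the function names and the six results are all made up, and the scoring assumes the usual league system of three points for a win, one for a draw and none for a defeat.

```python
from dataclasses import dataclass


@dataclass
class Match:
    points: int            # assumed league scoring: 3 win, 1 draw, 0 defeat
    player_featured: bool  # did the player under analysis take part?


def ppg(matches):
    """Average points per game, or None if the list is empty."""
    if not matches:
        return None
    return sum(m.points for m in matches) / len(matches)


def ppg_with_and_without(matches):
    """Split a club's matches by the player's involvement and compare PPG."""
    with_player = [m for m in matches if m.player_featured]
    without_player = [m for m in matches if not m.player_featured]
    return ppg(with_player), ppg(without_player)


# Hypothetical results, invented purely for illustration.
season = [
    Match(3, True), Match(3, True), Match(1, True), Match(0, True),
    Match(1, False), Match(0, False),
]

with_ppg, without_ppg = ppg_with_and_without(season)
print(f"PPG with player:    {with_ppg:.2f}")
print(f"PPG without player: {without_ppg:.2f}")
```

A large gap between the two figures hints that the player matters, though with only a handful of games on one side of the split the comparison is easily skewed by the strength of the opposition.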
