From landslide to buried alive: Why 2017 election forecasts weren't wrong
Swing when you're winning
In the aftermath of almost every recent election, two types of story get written based on the outcome. One is how the polls "got it wrong", how the forecast – surprise! – failed to match the actual result. The other, usually written by someone with at least basic statistical skills, explains why the polls mostly didn't "get it wrong."
Yet neither take is particularly helpful: both are pretty wrong in their way. Because the real issue, rarely addressed, is that the polls are simply incapable of delivering the sort of predictive accuracy that the general public and far too many in the mainstream media demand.
There are many reasons why this is the case. Polling methodology, changing demographics and modelling issues are key, as are issues inherent in the system by which we have chosen to select our elected representatives. The basic principle underlying statistical forecasting, or estimation, is that it uses sample-based observation to reduce the margin of error around future events. The larger the sample, the smaller the margin of error; likewise, the closer the sample is to the composition of the population under study, the greater the accuracy we can expect of any subsequent forecasts.
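For the curious, the link between sample size and margin of error can be sketched in a few lines of Python. This is a textbook approximation, not any pollster's actual method: it assumes a simple random sample, a 95 per cent confidence level (the 1.96 multiplier) and the worst-case vote share of 50 per cent. The function name is ours.

```python
import math

def margin_of_error(sample_size: int, share: float = 0.5, z: float = 1.96) -> float:
    """Approximate margin of error for one vote share from a simple random sample.

    Assumes a 95 per cent confidence level (z = 1.96) unless told otherwise.
    """
    return z * math.sqrt(share * (1.0 - share) / sample_size)

# Print the margin, in percentage points, for a few illustrative sample sizes.
for n in (500, 1000, 2000, 10000):
    print(f"n={n:>6}: +/- {100 * margin_of_error(n):.1f} points")
```

Note the square root: to halve the margin of error you need four times the sample, which is why ever-bigger polls buy ever-smaller gains.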
Let's start with the impact of sample size. The origins of forecasting lie in the 1950s concept of uniform swing. Back then, we in the UK were much more two-party than we are today – even after what has been widely recognised as a highly polarising election. It was easier to call elections based on swing, because, as elections guru David Butler was arguing just two weeks back, if you knew the swing, you could simply apply it across all seats and Bob's your uncle – or even PM.
On election night, he suggested, once a dozen or so results were in and the inter-party swing for those seats calculated, you pretty much knew the outcome of the election overall.
So, while opinion polls came with the usual caveat (all too often ignored) that there was a margin of error of around 2 to 3 per cent for any forecast, pollsters were still focused on just one quantity – swing – that was mostly uniform across the nation.
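A minimal sketch of that uniform swing arithmetic, for illustration only. The seats and vote shares below are invented, the function names are ours, and the swing is the conventional two-party figure: the average of one party's gain and the other's loss.

```python
def butler_swing(old, new, from_party, to_party):
    """Two-party swing, in percentage points, from one party to another."""
    gain = new[to_party] - old[to_party]
    loss = old[from_party] - new[from_party]
    return (gain + loss) / 2.0

def apply_uniform_swing(seats, swing, from_party, to_party):
    """Shift every seat's previous shares by the same swing; return projected winners."""
    projected = {}
    for name, shares in seats.items():
        shifted = dict(shares)
        shifted[from_party] -= swing
        shifted[to_party] += swing
        projected[name] = max(shifted, key=shifted.get)
    return projected

# Hypothetical previous-election vote shares (per cent) in three made-up seats.
previous = {
    "Seat A": {"Con": 48.0, "Lab": 40.0, "Other": 12.0},
    "Seat B": {"Con": 44.0, "Lab": 42.0, "Other": 14.0},
    "Seat C": {"Con": 35.0, "Lab": 52.0, "Other": 13.0},
}

# Swing measured across a handful of early declarations (also made up).
early_old = {"Con": 45.0, "Lab": 40.0}
early_new = {"Con": 42.0, "Lab": 43.0}

swing = butler_swing(early_old, early_new, "Con", "Lab")
winners = apply_uniform_swing(previous, swing, "Con", "Lab")
print(swing, winners)
```

One number, applied everywhere: the whole method stands or falls on the swing really being uniform.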
This, though, was an era where not only might entire households be expected to vote as one, but individual voting preference, once determined, frequently stuck from cradle to grave. But, as recent results have shown, the outcome of UK elections is increasingly a series of related but discontinuous regional or local polls.
Compare Scotland, where General Election 2017 saw an electorate seeking to punish what some saw as an out-of-touch SNP administration – and therefore boosting the Conservative seat share – with Wales, where there was less of a nationalist cushion and the Conservatives not only failed to win any seats from Labour but actually lost three to the party.
Note also how we increasingly disaggregate the results post-election into a variety of demographic cohorts to explain not just the headline result, but individual seat outcomes. There is plenty of analysis still to do, but there is already good evidence that voting split along age lines: older voters stuck with the Conservatives; the young came out for Labour, which is why the loss of not just Nick Clegg's seat in the university town of Sheffield but also the Tory bastion of Canterbury should be unsurprising.
With swing no longer uniform, the models that best predict elections appear to be those that build up the outcome, constituency by constituency, from basic demographics – bottom up rather than top down. Over the last few years, these have had some success. But they, too, are limited, because they require you to know three separate and independent things:
- how different demographic groups are likely to vote
- their "voting propensity"
- the likelihood of each demographic to vote
It is not hard to create a tool for election forecasting. Just set up a spreadsheet of demographic breakdown by constituency, factor in party propensity by demographic, and then play games by creating scenarios based on percentage voting by demographic. If 40 per cent of young people vote, the following constituencies will be Labour/Tory. But if 60 per cent vote, then we get this. It's great geek fun, but near useless for serious forecasting.
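The spreadsheet game described above might look something like this in Python. Every number here is invented for illustration: the two constituencies, the age bands, the party shares and the turnout scenarios are all assumptions, not polling data.

```python
# Hypothetical electorate by age group in two made-up constituencies.
ELECTORATE = {
    "University Town": {"18-34": 30000, "35-64": 25000, "65+": 15000},
    "Seaside Town": {"18-34": 12000, "35-64": 28000, "65+": 30000},
}

# Assumed party preference among those who actually vote (illustrative only).
PARTY_SHARE = {
    "18-34": {"Lab": 0.65, "Con": 0.35},
    "35-64": {"Lab": 0.48, "Con": 0.52},
    "65+": {"Lab": 0.30, "Con": 0.70},
}

def forecast(turnout):
    """Return the projected winner per seat for a given turnout rate per age group."""
    winners = {}
    for seat, groups in ELECTORATE.items():
        votes = {"Lab": 0.0, "Con": 0.0}
        for group, electors in groups.items():
            voters = electors * turnout[group]
            for party, share in PARTY_SHARE[group].items():
                votes[party] += voters * share
        winners[seat] = max(votes, key=votes.get)
    return winners

# Two scenarios that differ only in how many young people turn out.
low_youth = {"18-34": 0.40, "35-64": 0.65, "65+": 0.75}
high_youth = {"18-34": 0.60, "35-64": 0.65, "65+": 0.75}

print("40% youth turnout:", forecast(low_youth))
print("60% youth turnout:", forecast(high_youth))
```

With these made-up inputs, the university seat changes hands purely on the turnout assumption, which is precisely the number nobody knows in advance.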
That is because polling is pretty good at getting to voting propensity, less good at extracting relative propensity coefficients – which demographic attribute is most likely to act as a vote driver: do young women vote as women or as young people? – and close to useless at estimating voting likelihood. Rather, what most polls do is compensate for the fact that different demographics will turn out differentially by applying a series of correction factors and weightings to factor in propensity to vote.
And their evidence for the correct weightings to apply? Past elections provide a guide of sorts, but where something significant has changed – as clearly it did during General Election 2017 – what we are really talking about is circularity and educated guesswork. Pollsters think they know what turnout by group will be, and so they weight their models accordingly.
That, though, is just the beginning of their woes. When building models – no matter whether for the likelihood of critical component failures or the likelihood of credit default – an additional stage, known as validation, is required.
That means splitting the sample into two. One part is the modelling set, used to estimate key parameters like "propensity to vote Tory" and so build the initial model. The model is then tested by applying it to the other part: the data originally excluded from the model-building process.
This works well for estimating factors such as credit default propensity in financial services where we have good data on who has defaulted. It cannot work for elections where the result is not known until after the thing you are trying to model has happened.
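A rough sketch of that split, using synthetic credit-default-style records where the outcome is already known. The group names, the "true" rates, the 70/30 split and the choice of the Brier score as the test are all assumptions made for the example.

```python
import random

def split_sample(records, holdout_fraction=0.3, seed=0):
    """Shuffle the records and split them into a modelling set and a validation set."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    holdout_size = round(len(shuffled) * holdout_fraction)
    cut = len(shuffled) - holdout_size
    return shuffled[:cut], shuffled[cut:]

def estimate_rates(records):
    """Estimate the outcome rate for each group from the modelling set."""
    totals, positives = {}, {}
    for group, outcome in records:
        totals[group] = totals.get(group, 0) + 1
        positives[group] = positives.get(group, 0) + outcome
    return {group: positives[group] / totals[group] for group in totals}

def brier_score(rates, records, fallback=0.5):
    """Mean squared gap between predicted rate and known outcome (lower is better)."""
    gaps = [(rates.get(group, fallback) - outcome) ** 2 for group, outcome in records]
    return sum(gaps) / len(gaps)

# Synthetic (group, outcome) records; outcome is 1 for a default, 0 otherwise.
TRUE_RATES = {"high risk": 0.30, "low risk": 0.05}  # made-up rates
rng = random.Random(42)
records = []
for _ in range(1000):
    group = rng.choice(sorted(TRUE_RATES))
    records.append((group, 1 if rng.random() < TRUE_RATES[group] else 0))

modelling, validation = split_sample(records)
rates = estimate_rates(modelling)
print("Estimated rates:", rates)
print("Brier score on held-out data:", brier_score(rates, validation))
```

The final line is the one an election modeller cannot run: it needs the known outcome for the held-out records, and for an election that outcome only exists once the votes are counted.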
Then there is the fact that polls are no longer estimating single quantities (swing), but attempting to nail down multiple parameters (voting propensity and likelihood by demographic) and the margin of error on each is correspondingly widened. Add, too, voter volatility: the poll of polls may have been wrong, but polls on the eve of the election, from the likes of YouGov and Survation, weren't too far off the actual result.
So was everyone else wrong? Perhaps. As likely, though, is the possibility that previous polls were accurate on the day they were taken, but voters changed their minds. How would we ever know? Aggregation is an attempt to get around the problem that we need larger polling numbers just to get back to the level of accuracy we once had when estimating a single parameter – yet if the electorate is still making up its mind, aggregation is, itself, a distorting factor.
Last but far from least is the distorting nature of First Past the Post. I have dealt with this at length before, so no need to rehash the detail, other than to observe that there is no poll in the world, other than the actual one, that can forecast a majority of two votes (North East Fife) or margins of less than 0.1 per cent of the vote (Perth, Newcastle, Southampton Itchen).
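To put a rough number on that, here is a back-of-envelope sketch of the sample needed to pin a single vote share down to within 0.1 percentage points. It assumes simple random sampling, 95 per cent confidence and a hypothetical constituency electorate of 70,000, and it applies the standard finite population correction. The function names are ours.

```python
import math

def required_sample(margin, share=0.5, z=1.96):
    """Simple-random-sample size needed for a given margin of error on one vote share."""
    return z ** 2 * share * (1.0 - share) / margin ** 2

def finite_population_adjusted(n0, population):
    """Shrink the required sample when the population itself is small."""
    return n0 / (1.0 + (n0 - 1.0) / population)

ELECTORATE = 70_000  # hypothetical constituency electorate

n0 = required_sample(margin=0.001)  # 0.001 = 0.1 percentage points
needed = math.ceil(finite_population_adjusted(n0, ELECTORATE))

print(f"Unadjusted sample: {n0:,.0f}")
print(f"Adjusted for an electorate of {ELECTORATE:,}: {needed:,} "
      f"({100 * needed / ELECTORATE:.0f}% of it)")
```

Under those assumptions the "poll" has to reach the overwhelming majority of the electorate, at which point it has more or less become the election.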
Whether such results indicate a clear "will of the people" is an interesting question, but one that belongs to the realm of politics rather than psephology.
So did the polls "get it wrong"? No, at least not if you understand what polls can and, as elections grow more complicated, cannot do – which is provide anything more than an approximation of the outcome. They are, as The Register has argued previously, a useful tool for informing scenarios. That is all.
The real issue is not the polls, but mainstream media, which continues to make the massive category error of depicting polls as capable of more than that. They aren't – and it is past time we understood that. ®