Bishop Hill: Gonzo science and the Hockey Stick

Torturing the climate numbers until they confess

By Andrew Orlowski

Posted in Science, 8th February 2010 13:25 GMT

Interview In 2001 the IPCC published its Third Assessment report prominently featuring a graph that became "the logo of global warming". Previous historical reconstructions didn't show our modern warm climate as particularly anomalous. This was very different, and was hailed as a "call to action". Yet Michael Mann's studies were deeply flawed. Omit one or two proxies, for example, and the scary warming 'spike' disappears. Mann's model could produce hockey stick shapes using random data, such as baseball scores, or red noise. Critics alleged that Mann's choices of data and statistical tools all cooled the Medieval Warm Period, and emphasised late 20th Century warming.

A new book recounts how the 'Hockey Stick' model was created and more intriguingly, the political and institutional defence of the indefensible. (At one stage the Hockey Stick's defenders argued that trees on different continents had "teleconnections" with each other - a claim that wouldn't be out of place in a homeopathy brochure.)

Andrew Montford, a science publisher and blogger under the name Bishop Hill, has provided the storytelling to match the detective work and persistence of another blogger, Steve McIntyre, who dismantled the Stick. Andrew talked to us about The Hockey Stick Illusion and some of the key issues it highlights.

You say the IPCC needed the Hockey Stick to be true, and that's why it got such prominence. Was this the best they could do?

If you put any of the other major temperature reconstructions in place of the Hockey Stick I think it's true to say it doesn't look so frightening. They have a Medieval Warm Period that is within a whisker of the temperatures we have now. You can make them look scarier by overlaying the instrumental record over the 20th Century part of the graph - in fact, that's part of the reason why it looks scary. I show this in the book - in the Briffa series [see below].

But [these reconstructions] wouldn't convince anybody there's anything going on there at all. So yes, they needed it to convince people; and they promoted it, they had policies to sell. And when it proved to be flawed they had to stand behind it, because otherwise, they'd look stupid.

Now to the story. The preparation of the original Hockey Stick turns out to be a small part of the saga. How the issues McIntyre raised were dealt with by the scientists and the establishment is the main story.

Yes, the book is a detective story of how Steve McIntyre worked out how the Hockey Stick got its remarkable shape, which was a combination of data that's not suitable and questionable statistical methodology. I didn't want to dwell on what happened upfront, it's about how the Hockey Stick met its demise, if you like.

So for the data: people might have heard of Mann's use of bristlecones, which are poor proxies for temperature.

Right, the theory is if you have the right trees you can use them as a thermometer - they'll grow more in a warm year than they will in a cool year. But not every tree. The theory is that trees at the northern limit or upper treeline on the sides of mountains will be sensitive to temperature. But they could be affected by water and nutrients. And there is some doubt as to whether there really is a temperature signal in the tree rings. It's certainly very noisy, if it's there at all.

And this is one of the things that comes out in the book. I spend quite a lot of time discussing the verification statistics: whether the mathematical model derived from the tree rings is actually credible in statistical terms, and whether it looks as though the model can recreate temperatures of the past.

Keith Briffa's reconstruction showed no hockey stick

And this is something Mann went to great lengths, with the Hockey Stick, to keep from us?

Essentially statisticians have a range of measures of testing whether the numbers that fall out of their mathematical models match up to the real world. In the case of the Hockey Stick they used something called a verification period, where they reconstruct the temperatures for the second half of the 19th Century, based on the model, and compare that to the real temperatures, based on thermometers. That can be calculated using a range of measures, but the two that are important for the history of the Hockey Stick are R2 and the RE.

The R2 statistic is essentially a very standard statistical measure of how lines match up against each other, while the RE is a very obscure measure that is only used by climatologists, and has been very heavily criticised because its behaviour in different circumstances isn't really understood. It hasn't been studied by statisticians.
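
As a rough illustration of how the two measures can disagree, here is a minimal Python sketch. It assumes the textbook definitions: R2 as the squared Pearson correlation between observed and reconstructed values, and RE (reduction of error) as one minus the ratio of the squared prediction error to the squared error of simply guessing the calibration-period mean. The numbers are invented for the example and come from no study; the function and variable names are hypothetical.

```python
from statistics import mean

def r_squared(observed, predicted):
    """Squared Pearson correlation between observed and predicted values."""
    mo, mp = mean(observed), mean(predicted)
    cov = sum((o - mo) * (p - mp) for o, p in zip(observed, predicted))
    var_o = sum((o - mo) ** 2 for o in observed)
    var_p = sum((p - mp) ** 2 for p in predicted)
    return cov * cov / (var_o * var_p)

def reduction_of_error(observed, predicted, calibration_mean):
    """RE = 1 - (squared prediction error) / (squared error of the
    calibration-period mean used as a constant prediction)."""
    sse = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_ref = sum((o - calibration_mean) ** 2 for o in observed)
    return 1.0 - sse / ss_ref

# Invented verification-period values (assumption, not real data).
# The prediction gets the average level right but not the year-to-year wiggles.
observed = [-0.30, -0.25, -0.35, -0.28, -0.32]
predicted = [-0.26, -0.32, -0.32, -0.30, -0.30]
calibration_mean = 0.0  # assumed mean of the later calibration period

r2 = r_squared(observed, predicted)
re = reduction_of_error(observed, predicted, calibration_mean)
print(f"R2 = {r2:.3f}, RE = {re:.3f}")
```

With these made-up values RE comes out close to 1 while R2 is essentially zero: RE rewards a reconstruction for capturing a shift in the average level between the two periods, whereas R2 only looks at whether the ups and downs line up.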

In the story of the Hockey Stick I spend quite a lot of time discussing a paper by two associates of Mann called Wahl and Ammann, claiming they had replicated the Hockey Stick entirely. McIntyre did a lot of work looking at their study, and one of the things he concluded was that their model failed the R2 test. This meant it was not credible. So if Wahl and Ammann's model was not credible, Mann's was not either.

McIntyre put a lot of pressure on the journal that published it to publish the data and methodology, and they eventually agreed. And the paper said the Hockey Stick passed the RE test.

There's another really amazing story of how McIntyre found it didn't pass the RE test at all, but there had been a rather dubious, ad hoc adjustment made to the procedure to make it look as if it passed the RE test. It really is the most extraordinary part of the story I think.

What did they do?

This is where it gets quite technical. With the RE statistic, the question is how high does it have to be before you decide your results are 'significant'. So you calculate a benchmark, which involves throwing random numbers into your model, instead of real data, to see how well that compares against the real temperatures.

What Wahl and Ammann did in essence was not to take a random set of numbers, but a set of numbers that had been filtered in such a way that the benchmark for the RE became considerably lower, making the Hockey Stick look significant. This was a procedure that was new, no one had ever seen it before, and it defeated the basic object of the analysis, which was to compare what you got with what you could get with random numbers.
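
The general idea of a random-number benchmark can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the procedure used in any of the papers discussed: the "proxies" are AR(1) red noise, the model is a one-variable straight-line fit, the temperature series is synthetic, and every name and parameter (`phi`, `n_sims`, `quantile`) is hypothetical.

```python
import random
from statistics import mean

def ar1_series(n, phi, rng):
    """Red noise: each value is phi times the previous one plus white noise."""
    x, out = 0.0, []
    for _ in range(n):
        x = phi * x + rng.gauss(0.0, 1.0)
        out.append(x)
    return out

def fit_line(x, y):
    """Ordinary least squares slope and intercept for y ~ x."""
    mx, my = mean(x), mean(y)
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sxx
    return slope, my - slope * mx

def verification_re(proxy, temp, n_verif):
    """Calibrate on the later years, then score RE on the first n_verif years."""
    slope, intercept = fit_line(proxy[n_verif:], temp[n_verif:])
    predicted = [slope * p + intercept for p in proxy[:n_verif]]
    cal_mean = mean(temp[n_verif:])
    sse = sum((t - q) ** 2 for t, q in zip(temp[:n_verif], predicted))
    ss_ref = sum((t - cal_mean) ** 2 for t in temp[:n_verif])
    return 1.0 - sse / ss_ref

def re_benchmark(temp, n_verif, n_sims=500, phi=0.5, quantile=0.95, seed=0):
    """RE score that only (1 - quantile) of pure-noise 'proxies' manage to beat."""
    rng = random.Random(seed)
    scores = sorted(
        verification_re(ar1_series(len(temp), phi, rng), temp, n_verif)
        for _ in range(n_sims)
    )
    return scores[int(quantile * (n_sims - 1))]

# Synthetic "temperature" with a gentle trend plus noise (assumption, not real data).
temp_rng = random.Random(1)
temperature = [0.005 * year + temp_rng.gauss(0.0, 0.1) for year in range(130)]

benchmark = re_benchmark(temperature, n_verif=50)
print(f"RE needed to beat 95% of random series: {benchmark:.3f}")
```

A real reconstruction would then be called significant only if its own RE exceeded this benchmark. The point of contention described above is what goes into the simulated series: anything that pre-filters them changes the distribution of scores, and with it the hurdle.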

So the methodology remains secret, but Mann has gone on to apply the Hockey Stick to different situations. One of which is a Hockey Stick that didn't have tree proxies in it.

Mann has subsequently published new temperature reconstructions that all have Hockey Stick shapes, and these have been picked up by the media, and trumpeted as proof that the Hockey Stick despite its many flaws, was still OK.

There are many problems with these subsequent Hockey Sticks. The paper Mann did in 2008 is particularly famous because it was discovered that in one of the proxies he was using, which was a lake sediment series [varve] from Scandinavia, the uptick or blade of the Hockey Stick was due to agricultural disturbance of the sediment rather than anything climatic. But not only that, he'd got the segment upside-down anyway.

This is the famous 'Upside Down Mann'

His mathematical algorithm couldn't detect the fact it was upside down. It just said 'look, there's a deviation from the normal in the 20th Century, that seems to match the Hockey Stick, therefore I have something that can predict temperature'. It's rather ludicrous, and it has been picked up by people involved in lake sediment research, who said he's got it wrong.
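
Why a purely statistical fit cannot notice orientation is easy to show with a toy regression. The Python sketch below is not Mann's algorithm; it is a plain least-squares fit on invented numbers, with hypothetical names, and it only illustrates the general property that flipping a series reverses the sign of the fitted slope while leaving the quality of the fit unchanged.

```python
from statistics import mean

def fit_and_score(proxy, temperature):
    """Least-squares slope of temperature on proxy, plus the squared correlation."""
    mp, mt = mean(proxy), mean(temperature)
    spp = sum((p - mp) ** 2 for p in proxy)
    stt = sum((t - mt) ** 2 for t in temperature)
    spt = sum((p - mp) * (t - mt) for p, t in zip(proxy, temperature))
    return spt / spp, spt * spt / (spp * stt)

# Invented values for illustration only (assumption, not real data).
temperature = [0.1, 0.0, 0.2, 0.3, 0.5, 0.6]
proxy = [1.0, 1.2, 1.1, 1.6, 1.9, 2.3]
flipped = [-p for p in proxy]  # the same series turned upside down

slope, r2 = fit_and_score(proxy, temperature)
slope_flipped, r2_flipped = fit_and_score(flipped, temperature)
print(f"original: slope={slope:+.3f}, R2={r2:.3f}")
print(f"flipped:  slope={slope_flipped:+.3f}, R2={r2_flipped:.3f}")
```

The fit is equally good either way, so only outside knowledge of what the proxy physically measures can say which orientation is right.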

Mann has never acknowledged it, and it's still in the literature. It will be quite interesting to see if these papers end up in the fifth IPCC assessment report, because that's the state of play.

It's quite emblematic of how climate science has been conducted over the past ten years. The Upside Down Data is fine anyway, we're told, because we all know it's getting warmer; it's fine to use upside-down data, because the data fits. It's blessed as an OK thing to do.

What amazed me in your account is the testimony by one climatologist, D'Arrigo, who told the Senate that cherry-picking is OK, that if you get "a good climatic story" (her words) from the data, then you can throw away the data that doesn't suit the results. Torture the numbers until they confess, you call it.

It's one of those things where you do a double take and ask yourself 'did she really mean to say that?'. And perhaps there is some explanation for what she said and we've all got the wrong end of the stick, but as it was reported, yes, she said cherry picking was acceptable. There are other people working in the same area who have said the same thing.

One paleoclimatologist called Jan Esper has said you can throw away the bits that don't give you the right answer - and it's an advantage 'unique to climatology'. Yes, they say it, and seem to believe it.

This isn't science, is it?

No, it isn't science. This is something Edward Wegman, the statistician who did a report into the science, said - it's not science. The cardinal rule is you decide your methodology and data, and see what comes out at the end. You don't decide what you want at the end, then choose the data and methodology that gives you the 'right answer'.

Where does this leave the science? We've heard the science is sound, but as people look closer and closer they begin to question what that means. The 'science' appears to be theoretical. There's a theory that greenhouse gas emissions have feedback, multiplier effects, but it is just a theory.

My book concentrates on just one area of the science and finds the science is highly questionable. More importantly I think, it finds the processes and procedures are corrupt, and they're biased. And this is borne out by what we see in the Climategate emails.

Penmanship: put in context, modern warming doesn't seem so scary.

So if you stand back from it for a while and look at the big picture, you have this theory, and you have a lot of biased and corrupt science, if you like, that says yes, the theory is true. The general public are going to look at that and say "Well, I don't want biased science to tell me it's true, I want unbiased science to tell me it's true".

Where do we go from here? I think we have to find a way where people who don't have vested interests in the result assess the science again. And I don't see how that can happen again under the auspices of the IPCC.

They seem to be corrupt from top to bottom. The whole process seems to be developed in order to refute sceptics, rather than get to the truth. They're not a body that exists to get to the truth. I think they've got to tear it down and start again.

Whether that's feasible, I don't know. The public may not wear it either way. I think public opinion is going to fall away quite quickly now.

You can't see the science being conducted outside the IPCC, then? People forget it doesn't fund science, it doesn't fund original research, it's there to report on the research that's conducted. I'm not quite convinced of the argument that if we get rid of the IPCC the problem will have gone away - it'll be the same people at the same institutions looking for the same 'results'.

I think that's true. But if you have outside people writing the report, you might get a different answer. But whether that's feasible I don't know. The thing that struck me is the paleoclimate studies I've written about are mainly about statistics. But when a statistician looked at what had been done, he decided it was nonsense, it wasn't science, and it was biased.

So when you do get an educated outsider looking at these things maybe you can come up with an unbiased way of dealing with the questions. The question isn't going to go away. The theory that CO2 will absorb heat is true, other things being equal. What we don't understand is whether the other things are equal. It won't go away.

One thing that resonates with Register readers is that in the private sector, someone making such dubious claims would be shot down, as you point out, because the consequences involve other people's money. This wasn't the case with the state and UN and EU funding for climate science - where anything goes. It doesn't have the same rigour. The irony is that people say 'business automatically corrupts scientists'.

My way of looking at that would be whether you can ever get a publicly funded scientist to give you an honest answer. Because if a publicly funded scientist says their area is a problem, they'll be a better-funded public scientist! How do you escape that when the bulk of your science is coming from the government?

That's not going to change in the near future. The only way is to have outsiders looking at what they're doing. Whether that's bloggers, or independent researchers, or maybe even somebody within the state machine, I don't know.

The fact is the people who are there have vested interests and that has to be recognised.

Isn't that quite a cynical point of view? It's almost like saying the state shouldn't fund the police at all, because they're just going around creating crimes, and creating jobs for themselves.

I think you have to design systems of control and audit that recognise that interest. You can't put scientists up on a pedestal and assume they are these machines that operate without those normal human pressures on them. If they see money at the end of it, there will be a temptation to follow that money. Not all of them, but some of them will. You can't trust that the advice you're getting as a policy maker is unbiased.

There's a dynamic though, isn't there. Politicians look for something to do, to be busy; the media looks for scares; the scientists provide the justification for the scares - it's really a three way relationship, and we've seen it with health issues and other environmental issues too.

Like BSE, yes. If I was a policy maker I would be a bit nervous about some of the advice I was getting. Particularly when people do want to cover their backsides. It's safer to say there's a problem, because then your backside is covered and you'll get more funding.

So is ensuring there's a plurality of funding sources the answer?

Obviously, that would be where I come from. I'm a libertarian and yes, I don't think governments should be involved in funding science. But it's not going to happen in my lifetime. ®


This week Michael Mann was exonerated by his employer Penn State University, after an internal misconduct investigation. The investigators failed to contact McIntyre, Wegman or any other critics of Mann's work. One of the two investigators admitted he had not read the Climategate emails at all. However, Mann's work has been forwarded to a full investigating committee which will examine whether Mann is "undermining confidence in his findings as a scientist, and given that it may be undermining public trust in science in general and climate science specifically".