Resignations, retractions and the process of science

6 Sep 2011 by Gavin

Much is being written about the very public resignation of Wolfgang Wagner from the editorship of Remote Sensing over the publication of Spencer and Braswell (2011) – and rightly so. It is a very rare situation that an editor resigns over the failure of peer review, and to my knowledge it has only happened once before in anything related to climate science – the mass resignation of 6 editors at Climate Research in 2003 in the wake of the Soon and Baliunas debacle. Some of the commentary this weekend has been reasonable, but many people are obviously puzzled by this turn of events and unsupported rumours are flying around.

The primary question of course is why an editor would resign over a published paper. Wagner (who I have never met or communicated with) explains it well in his letter:

After having become aware of the situation, and studying the various pro and contra arguments, I agree with the critics of the paper. Therefore, I would like to take the responsibility for this editorial decision and, as a result, step down as Editor-in-Chief of the journal Remote Sensing.
…
With this step I would also like to personally protest against how the authors and like-minded climate sceptics have much exaggerated the paper’s conclusions in public statements. [UAH press release. Forbes article etc.]

He clearly feels as though he, and his fledgling journal, were played in order to get a politicised message to the media. A more seasoned editor might well have acted differently at the various stages and so he resigned to take responsibility for the consequences of not doing a better job, and, presumably, to try and staunch the impression that Remote Sensing is a journal where you can get anything published.

This was nonetheless a very unusual step. Many bad papers are published (some of which are egregiously worse than the one in question here) and yet very few editors resign over the way the process was handled. (In fact, I think this is unique – the resignations at Climate Research in 2003 were not of the editors involved in dealing with Soon and Baliunas, but the other members of the board protesting at the inability and/or unwillingness of the publisher to deal with the resulting mess).

What makes a paper ‘bad’, though? It is certainly not a paper that simply comes to a conclusion that is controversial or that goes against the mainstream, and it isn’t that the paper’s conclusions are unethical or immoral. Instead, a ‘bad’ paper is one that fails to acknowledge or deal with prior work, or that makes substantive errors in the analysis, or that draws conclusions that do not logically follow from the results, or that fails to deal fairly with alternative explanations (or all of the above). Of course, papers can be mistaken or come to invalid conclusions for many innocent reasons and that doesn’t necessarily make them ‘bad’ in this sense.

So where does S&B11 fall on this spectrum?

The signs of sloppy work and (at best) cursory reviewing are clear on even a brief look at the paper. Figure 2b has the axes mislabeled with incorrect units. No error bars are given on the correlations in figure 3 (and they are substantial – see figure 2 in the new Dessler paper). The model-data comparisons are not like-with-like (10 years of data from the real world compared to 100 years in the model – which also makes a big difference). And the ‘bottom-line’ implication by S&B that their reported discrepancy correlates with climate sensitivity is not even supported by their own figure 3. Their failure to acknowledge previous work on the role of ENSO in creating the TOA radiative changes they are examining (such as Trenberth et al, 2010 or Chung et al, 2010), likely led them to ignore the fact that it is the simulation of ENSO variability, not climate sensitivity, that determines how well the models match the S&B analysis (as clearly demonstrated in Trenberth and Fasullo’s guest post here last month). With better peer review, Spencer could perhaps have discovered these things for himself, and a better and more useful paper might have resulted. By trying to do an end run around his critics, Spencer ended up running into a wall.
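To make the like-with-like point concrete, here is a minimal sketch in Python. It uses purely synthetic numbers (no CERES, HadCRUT3 or model output), and it is not the procedure used by S&B or by Dessler; the function names `lagged_slope` and `synthetic_series` and all the coefficients are assumptions made up for illustration. It regresses a noisy "flux" on a red-noise "temperature" at a chosen lag, once over a 100-year series and once over each 10-year window of the same series, so the sampling spread of the short records can be seen next to the long-record value.

```python
import random

def lagged_slope(flux, temp, lag):
    """Least-squares slope of flux[t + lag] on temp[t]; lag is in months and may be negative."""
    if lag >= 0:
        x, y = temp[:len(temp) - lag], flux[lag:]
    else:
        x, y = temp[-lag:], flux[:len(flux) + lag]
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return sxy / sxx

def synthetic_series(n_months, seed):
    """Hypothetical data: AR(1) temperature anomaly and a flux that partly follows it."""
    rng = random.Random(seed)
    temp, flux, t = [], [], 0.0
    for _ in range(n_months):
        t = 0.9 * t + rng.gauss(0.0, 0.1)            # red-noise "temperature"
        temp.append(t)
        flux.append(0.5 * t + rng.gauss(0.0, 0.5))   # assumed response plus noise
    return flux, temp

flux, temp = synthetic_series(1200, seed=0)          # 100 synthetic years
full_slope = lagged_slope(flux, temp, lag=0)
decade_slopes = [
    lagged_slope(flux[i:i + 120], temp[i:i + 120], lag=0)
    for i in range(0, 1200, 120)                     # ten non-overlapping 10-year windows
]
print(f"100-year slope: {full_slope:.2f}")
print(f"10-year slopes: {min(decade_slopes):.2f} to {max(decade_slopes):.2f}")
```

The only point of the sketch is that a single 10-year record is one draw from a fairly wide distribution, so comparing it with a 100-year estimate without error bars can mislead.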

Of course, Spencer does not see this in the same light at all. His comments both before the publication of the paper and subsequent to the editor’s resignation indicate that he thinks that he is being persecuted by (unnamed) ‘IPCC Gatekeepers’ who are conspiring to suppress his results – he even insists that this was “one damn fine and convincing paper”. As well as straining credulity to the maximum, I find this both unfortunate and curious. It is unfortunate because this attitude makes it almost impossible for him to take on board constructive criticism, and given that none of us are perfect, there are many times when doing so is essential. It is also curious because there is no evidence of any grand conspiracy, just people disagreeing with and criticising his conclusions (which, as a scientist, you really just have to get used to!). It was likely S&B’s desire to avoid dealing with that criticism that led them to a non-standard journal, whose editor very likely followed the authors’ suggestions for (friendly) reviewers, whose resulting reviews did little (if anything) to strengthen the paper.

Reactions to this turn of events have been decidedly mixed (though falling along existing lines for the most part). A few people have (I think correctly) noted that the paper itself was of ‘minor consequence’ and does not explicitly claim anything much other than that correlation analysis over a short time period isn’t going to constrain climate sensitivity, and that at first glance, there was a mismatch between models and observations in a particular calculation. The first claim is actually uncontroversial (despite what Spencer would have you believe), and the second turns out to be less interesting than it first seems (see Trenberth and Fasullo’s RC post). However, the media and blogospheric interest in the paper had very little to do with the actual paper, rather it was provoked by the over-exaggerated press release from UAH and the truly absurd piece in Forbes by the Heartland Institute’s James Taylor.

Roger Pielke Sr. has accused Wagner of ‘politicizing’ the situation by resigning, but this is completely backwards. The politicisation of the situation came almost entirely from Spencer and Taylor, and Wagner’s resignation is a recognition that he should have done a better job to prevent that. Statements from Ross McKitrick that Wagner is a “grovelling, terrified coward” for his action are completely beyond the pale (as well as being untrue, possibly libelous, and stated with no evidence whatsoever).

The question has also arisen why the paper itself has not been retracted (and indeed will not be). However, that would be a really big step. I can only think of two climate science related papers that have been retracted in recent years – one was for plagiarism (among other problems: Said et al, 2008) and the other was because of a numerical calculation error that fatally undermined the reported results. There are of course many, many more papers that are wrong, mistaken and/or ‘bad’ (in the sense defined above) and yet very few retractions occur. I think (rightly) that people feel that the best way to deal with these papers is within the literature itself, and in this case it is happening this week in GRL (Dessler, 2011), and in Remote Sensing in a few months. That’s the way it should be, and neither resignations nor retractions are likely to become more dominant – despite the amount of popcorn being passed around.

118 Responses to "Resignations, retractions and the process of science"

Russell says

7 Sep 2011 at 6:21 AM

The disparity Gavin notes between S&B’s unexceptional technical premise, which he finds “actually uncontroversial (despite what Spencer would have you believe)”, and its reception recalls another controversy in which “the media and blogospheric interest in the paper had very little to do with the actual paper, rather it was provoked by the over-exaggerated press release”.

In the original case, the press release and subsequent media hype stemmed not “from UAH and the truly absurd piece in Forbes by the Heartland Institute’s James Taylor”, but from a full-service inside-the-Beltway PR firm, Porter Novelli, which provided the jaw-dropping artist’s impressions that transformed an “actually uncontroversial” claim – it does tend to be cooler in the shade – into a politically charged campaign to overthrow some of the basic assumptions of strategic policy at the height of the Cold War.

Denial is where you find it, and Steve Schneider may be spinning in his grave at the ‘nuclear winter’ controversy’s transformation from a locus classicus of the hazards of science by press conference and DIY peer review into a handy how-to guide for practitioners of media hype.

Instead of the chagrined editors of Foreign Affairs commissioning Schneider & Thompson to defrost Sagan’s apocalyptic gambit, we see Wagner stepping down to protest the shenanigans of a latter day Freeze Movement.
TimTheToolMan says

7 Sep 2011 at 6:46 AM

In Dessler’s recent paper, clearly the Focean term has a large component of DSR.

Why is Dessler justified in using a value of Rcloud of 0.5W/m2 and call it “energy trapped by clouds” when it appears that the term Rcloud for radiative forcing should include albedo effects resulting in a much larger value?
Russ R. says

7 Sep 2011 at 7:35 AM

@ Martin Vermeer,

First, thanks for your civil response… far better than some others seem able to manage.

As I understand Figure 2, only the blue line is a reproduction of SB11’s results using CERES observations regressed on HadCRUT3 temperature data. The three red lines are Dessler’s own regression series using different temperature data sets (MERRA, ERA-Interim, and GISTEMP).

So there are two important issues here… the first is the convenient inclusion or exclusion of results to suit one’s view (which both authors have demonstrated, albeit to different degrees), and the second is the sizable gaps still remaining between observations and most models’ outputs over two specific intervals (2-4 months lag and 10-15 months lead).

I’d agree with D11’s point that SB11 was wrong to not include outputs from all 14 (or 13) models, and I would suspect this choice to exclude certain results was likely made to overemphasize the gap, especially if the excluded series looked the way D11 showed them (though I can’t be certain since SB11 and D11 generated their model runs in different ways).

On the flip-side of that, I see absolutely nothing wrong with SB11’s use of HadCRUT temperature data for their regressions. D11’s unexplained decision to include observations regressed on other temperature series would seem to have no real purpose except to narrow the gap. (Not as serious as excluding results, but still a form of cherry-picking.)

As for the gaps, as I pointed out already, specifically for lag intervals of 2, 3 and 4 months, absolutely NONE of the 13 model output regressions come within 2σ of the original SB11/HadCRUT observation series, and 10 of the 13 don’t come close to ANY of the 4 observation series (instead clustering in a range of 4σ to 6σ from the SB11/HadCRUT regression series).

A similar, though less pronounced gap exists among the lead regressions, looking specifically at the interval of 12-14 months lead. Here, none of the model outputs fall within 2σ of the D11/GISTEMP regression, and most (9 of 13) fall outside of 2σ from the SB11/HadCRUT3 regression.
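The kind of count described above can be sketched in a few lines of Python. Everything here is hypothetical: the model names, the regression coefficients and the sigmas are invented for illustration and are not read off any figure in SB11 or D11, and `models_within_k_sigma` is just a name chosen for this example. The function counts, lag by lag, how many model curves fall inside the observational value plus or minus k sigma.

```python
def models_within_k_sigma(model_curves, obs_curve, obs_sigma, lags, k=2.0):
    """For each lag, count the models whose value lies within k*sigma of the observation."""
    counts = {}
    for lag in lags:
        lo = obs_curve[lag] - k * obs_sigma[lag]
        hi = obs_curve[lag] + k * obs_sigma[lag]
        counts[lag] = sum(1 for curve in model_curves.values() if lo <= curve[lag] <= hi)
    return counts

# Hypothetical regression coefficients (W/m^2 per K), keyed by lag in months.
obs = {2: 1.0, 3: 1.1, 4: 1.0}
sigma = {2: 0.2, 3: 0.2, 4: 0.25}
models = {
    "model_a": {2: 0.1, 3: 0.2, 4: 0.3},
    "model_b": {2: 0.7, 3: 0.8, 4: 0.4},
    "model_c": {2: -0.2, 3: 0.0, 4: 0.1},
}

counts = models_within_k_sigma(models, obs, sigma, lags=[2, 3, 4])
print(counts)  # {2: 1, 3: 1, 4: 0}
```

Note that this uses only the formal sigma of one observational curve; a fuller comparison would presumably also account for the uncertainty in each model curve and for the spread between observational data sets.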

Now I’m not nearly educated enough about this to speculate on the cause of these gaps, or their relative importance, but D11 has confirmed to anyone with eyes that significant (>2σ) gaps exist.

I could be interpreting this all incorrectly, but the gaps are pretty clearly graphed, so my reading ability shouldn’t be an issue here.
Kevin McKinney says

7 Sep 2011 at 8:13 AM

Preview is, indeed working. Of course, one has to remember to actually look down and read it. [Mumbles, blushes, shuffles feet.]

#44–andrew30: “Did the Data show that the amount of energy leaving the system was greater than any of the computer simulations indicated?” Gavin: “No.”

I was glad to read that, as I’d read Dr. Spencer’s comments suggesting that that had been shown, yet I was unable to find a basis for that idea in SB ’11 (which I’ve read through twice, but–I must admit–I still understand incompletely.)

Specifically, what SB ’11 presented was not net radiation flux observations versus model results–that would clearly have been a basis for the ‘energy leaving the system’ claim. They presented rather the correlation of net flux with surface temperature changes in observations versus (some) climate models. It’s not clear to me that the model-observation differences imply a difference in net flux.

(I presume that S & B didn’t simply present the net flux data separately for each because it would have been obvious that it wasn’t ‘apples-to-apples,’ in that observations are analogous to a single model realization, not to an ensemble of model runs, which I presume is the case for the model data used. But if that’s correct, haven’t they simply camouflaged the problem by coming at it through the statistical back door, as it were? In other words, might the different lag-lead regressions reflect lesser variability in ensemble model runs, not climate sensitivity? Or perhaps that’s just restating the point about ENSO from a different perspective?)

#49–Yes, I’ve quoted that sentence a couple of times for the folks swallowing the “blows a hole in AGW” hype. In conjunction with the statements made by Dr. Spencer on his blog, it also becomes a pretty good illustration of the ‘end run’ strategy noted above.

#50–Martin, thanks for highlighting the seemingly non-random selection of climate models in SB ’11. Curious, that.
Martin Vermeer says

7 Sep 2011 at 9:50 AM

I see absolutely nothing wrong with SB11’s use of HadCRUT temperature data for their regressions

But Russ, doesn’t it make you wonder why they chose it? The paper doesn’t say. Where is your curiosity? :-)

Dessler clearly wonders: “… they plotted … the particular observational data set that provided maximum support for their hypothesis.”

BTW trying alternatives like this gives you one more handle on the real uncertainty of the observational curve, in addition to the formal sigmas. A due-diligence thing, and rather obvious.
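As a rough illustration of that due-diligence check, here is a minimal Python sketch on synthetic data. The three "products" are invented stand-ins (a shared signal plus different amounts of measurement noise), not HadCRUT3, GISTEMP or any reanalysis, and the helper `slope` and all the coefficients are assumptions for this example only. It repeats the same regression against each product and reports the spread in the result.

```python
import random

def slope(y, x):
    """Ordinary least-squares slope of y on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / sxx

rng = random.Random(1)
signal = [rng.gauss(0.0, 0.2) for _ in range(120)]       # shared "true" anomaly, 10 years monthly
flux = [0.5 * s + rng.gauss(0.0, 0.3) for s in signal]   # assumed flux response plus noise

# Three hypothetical temperature products: same signal, different measurement noise.
datasets = {
    name: [s + rng.gauss(0.0, noise) for s in signal]
    for name, noise in [("product_a", 0.05), ("product_b", 0.08), ("product_c", 0.10)]
}

slopes = {name: slope(flux, series) for name, series in datasets.items()}
spread = max(slopes.values()) - min(slopes.values())
for name, value in slopes.items():
    print(f"{name}: {value:.2f}")
print(f"spread across products: {spread:.2f}")
```

If the spread across products is comparable to the formal sigma from any single product, that is a hint the formal error bars understate the real uncertainty.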
Russ R. says

7 Sep 2011 at 11:06 AM

@ Martin Vermeer,

“But Russ, doesn’t it make you wonder why they chose it? The paper doesn’t say. Where is your curiosity?”

I would question a lot of other things before questioning the choice to use HadCRUT3 data, which in my novice opinion seems pretty uncontroversial.

If anything, I would probably have assumed that SB11 would have used UAH temperature data, given Spencer’s familiarity with it. (If I were to guess why they didn’t use UAH data, it could be because the satellite data are more sensitive to ENSO variations, recording higher peaks and lower troughs during ENSO events, so this may have been a reason to favour surface measurements over satellite measures for a time period where ENSO was a significant driver of variability.)

“Dessler clearly wonders: ‘… they plotted … the particular observational data set that provided maximum support for their hypothesis.’”

And likewise, D11 plotted the data sets that provided the maximum support for his hypothesis. (Coincidentally, the GISTEMP regression diverges from the model outputs by more than the HadCRUT3 regression for lead times of 12-14 months.)

“BTW trying alternatives like this gives you one more handle on the real uncertainty of the observational curve, in addition to the formal sigmas. A due-diligence thing, and rather obvious.”

Overall, I agree with you on this… more relevant comparisons are better than fewer (provided they don’t create clutter).

I guess the big question I have is this: Since both SB11 and D11 show a couple of significant (>2σ) non-conformities between virtually all of the model outputs and the observed lead-lag relationship between radiative flux and temperature change, why is there so much discord over these (now confirmed) findings? Instead of trying to ignore the divergences, shouldn’t modelers instead be saying “Thanks for pointing out a shortcoming in our model. We’ll happily work to correct this, and in the process make our understanding of the physical processes more robust.”?

Who knows… the net impact of improving the models to reflect this relationship might end up being entirely trivial, or it might be significant. But rather than arguing whether the models might need to be changed or not, why not update them regardless and see if it has an impact? You know… “a due-diligence thing”.
BillS says

7 Sep 2011 at 11:19 AM

Do all universities (US and non-US) issue press releases? Wonder what the criteria might be for doing so?

Any thoughts on the practice of allowing authors to suggest a pool of reviewers?
Septic Matthew says

7 Sep 2011 at 11:57 AM

Russ R. and Martin Vermeer, you have a nice interchange.

After reading hyped claims and counter-claims for decades, and after reading Wagner’s letter in its entirety, I can’t see how his resignation accomplishes anything good. It would have been better if he had published, after 6 months or so, a rebuttal by Dessler or someone else; and then after a while the counter-rebuttal by Spencer, and so forth, with an editorial note that further commentary ought to have something new.

SB11, D11 (as they are called) and their precursors and successors are important because clouds are important. They are trying to gain as much information as possible from extant observational data sets, but the data sets themselves have limited utility because they are observational rather than experimental. “Limited utility” does not imply “no utility”. As with Mann et al analyses of proxy data, these analyses will stimulate other people to carry out more extensive data analyses using multiple methods and seeking relationships with other data in order to explore causal hypotheses as well as possible.

It is true that Spencer and friends exaggerated the importance of his paper; however, Wagner’s resignation makes it look like SB11 was REALLY IMPORTANT. It has added to the hype instead of quelling it.

There is hardly space to criticize everyone who made a mistake in this matter, but Ross McKitrick really ought to have kept quiet.
Sphaerica (Bob) says

7 Sep 2011 at 12:17 PM

56, Russ R,

None of your complaints in any way address the real issue, which is that Spencer tries to infer climate sensitivity from his study when that is not possible. As is pointed out by Dessler, all he is doing when he compares models to observations is to evaluate the ability of the models to accurately predict ENSO events, which is a known challenge that is being worked on and improving.

From Gavin’s post above:

And the ‘bottom-line’ implication by S&B that their reported discrepancy correlates with climate sensitivity is not even supported by their own figure 3. Their failure to acknowledge previous work on the role of ENSO in creating the TOA radiative changes they are examining (such as Trenberth et al, 2010 or Chung et al, 2010), likely led them to ignore the fact that it is the simulation of ENSO variability, not climate sensitivity, that determines how well the models match the S&B analysis (as clearly demonstrated in Trenberth and Fasullo’s guest post here last month).

Spencer’s method of using this study to evaluate climate sensitivity is very clearly flawed, and your own focus is misplaced.
Sphaerica (Bob) says

7 Sep 2011 at 12:23 PM

58, Septic Matthew,

Of course there’s every chance that Wagner did not resign to make some sort of political statement, but rather as a professional move to try to rescue the reputation of a young and aspiring journal from the coming storm, in an effort to prevent reputable scientists from turning their back on it as a possible avenue of publication for future papers.

And when I say “chance” what I mean is that that’s obviously what he did. I doubt he made this decision in a vacuum in an effort to be a climate-martyr-hero and just landed it on his superior’s desk one morning. He was a part of a business, a responsible part, and they needed to make moves to support their business. He was the price that Remote Sensing had to pay for a business mistake.

The denial need to see everything in the political perspective of the “climate debate” is warping this out of perspective. The correct perspective is that of a business running a journal, not of a maverick science-editor gone rogue to bring down the mighty forces of light who are fighting for truth and denial-reason against the dark-evil-scientist-cabal.
steven mosher says

7 Sep 2011 at 12:26 PM

Martin:

“BTW trying alternatives like this gives you one more handle on the real uncertainty of the observational curve, in addition to the formal sigmas. A due-diligence thing, and rather obvious.”

Thank you. Over the course of the past four years quite a number of us have suggested just this type of due-diligence thing repeatedly. These suggestions, which seem utterly normal to anyone who has had to work with messy datasets, conflicting datasets, and divergent models, have been routinely met with catcalls, insults, and challenges to “do your own damn science.” I find it heartening that you publicly endorse the approach. This would be a good thing for reviewers to request. What’s the sensitivity of your results to your data selection decisions? And of course you’ll have no objection to others asking these fundamental due diligence questions.

[Response: Mosher: There has never been any objection to the idea of ‘due diligence’ at RealClimate. The objection has been to the laughable and arrogant claim (repeated ad nauseam by you) that the idea of ‘diligence’ hasn’t occurred to anyone before, and to the offensive and unsubstantiated accusation that the mainstream scientific community has placed scientific diligence secondary to a perceived political agenda.–eric]
dhogaza says

7 Sep 2011 at 12:38 PM

Septic Matthew:

After reading hyped claims and counter-claims for decades, and after reading Wagner’s letter in its entirety, I can’t see how his resignation accomplishes anything good.

Perhaps rather than trying to “do good” he was simply preserving his personal honor and reputation. Such things count to some people. To those smearing him … not so much.
MapleLeaf says

7 Sep 2011 at 12:58 PM

Time to play whack-a-mole again. Roy responds. He really ought to have thought and reflected more before doing so…
Kevin McKinney says

7 Sep 2011 at 12:59 PM

“But rather than arguing whether the models might need to be changed or not, why not update them regardless and see if it has an impact?”

They are basically always undergoing updating, as I understand it. And of course newer models are always incorporating more and more aspects of the real world, as the growth of computing power allows. A nice graphic illustrating the latter process can be found as Figure 3 of the following review article:

http://onlinelibrary.wiley.com/doi/10.1002/wcc.95/full

In short, your quoted question is pretty much moot–nobody is arguing against improving climate models. They’re just going ahead and doing it.
Septic Matthew says

7 Sep 2011 at 1:13 PM

Sphaerica(Bob): The correct perspective is that of a business running a journal, not of a maverick science-editor gone rogue to bring down the mighty forces of light who are fighting for truth and denial-reason against the dark-evil-scientist-cabal.

It’s purely a business decision? I don’t know for sure, but 56,000 downloads speaks to a business success, and a mere publication of a rebuttal would have achieved more business success than Wagner’s resignation.

Dessler’s rebuttal shows that SB11 would have been a better paper had S&B presented results based on more than 1 temperature series, and had S&B presented all model results instead of a selection. D11 is not itself beyond reproach, as it misquoted the main conclusion of SB11. SB11 does not refute D10, but merely shows that the analysis of lagged correlations produces a slightly higher squared correlation while reversing the direction of the cloud-temperature link. As a business proposition, I don’t see how the business of the journal would be hurt by continuing to publish a long series of interchanges on this topic — extending on to Granger causality, intervening variables like humidity (if measured), cosmic rays (if measured), nonlinear nonstationary vector autoregressive models, etc. The journal could have produced a high “impact factor” in the journal ratings.

Not everyone agrees with your conjecture that it is purely a business decision.

61, dhogaza: Perhaps rather than trying to “do good” he was simply preserving his personal honor and reputation.

I don’t think anyone believes that it is dishonorable for an editor to publish a paper that is too good to retract, that he later came to believe was incorrect. That his letter says the decision was not [purely political] obscures the motivational analysis.
Dan H. says

7 Sep 2011 at 1:14 PM

I tend to agree with septic on this one. Neither paper is very important, except that clouds are important to climate researchers. That said, neither paper presents any new insight into cloud behavior. The resignation only seemed to focus attention on the papers, which are rather benign in their conclusions.

Reminds me of the lyrics to the song Signs, “hurray for our side.”
Martin Vermeer says

7 Sep 2011 at 2:51 PM

Moshpit teaching granny to suck eggs (Gavin what’s that in Yiddish?)

[Response: חוצפּה – gavin]
Sphaerica (Bob) says

7 Sep 2011 at 2:57 PM

65, Septic Matthew,

Except the rebuttal was published in another journal. The point of the resignation isn’t the quality of the paper, it’s the future of the journal. The editor’s job isn’t to make sure that the science is done right next time, it’s to make sure that his journal remains a part of that process.

Not everyone agrees with your conjecture that it is purely a business decision.

Please note that “not everyone” in this sentence refers, as far as I can tell, exclusively to climate contrarians. They are the only ones perceiving this as an outlandish political statement.
Sphaerica (Bob) says

7 Sep 2011 at 3:03 PM

66, Dan H.

The papers would not have been all that important. If the UAH press release, Fox News and Forbes hadn’t hyped it into something it wasn’t (i.e. the final nail in the entire climate change coffin, overturning all of climate science), then yes, this would have been a non-issue, with no resignation involved.

You’ll note that this was a major point in Wagner’s resignation letter, i.e. the behavior of the scientists and media circus surrounding the publication of the original paper. They backed him into that corner by making it into something it shouldn’t have been, i.e. treated as a landmark paper before it was even seen by other climate scientists.
Øystein says

7 Sep 2011 at 3:29 PM

Steven Mosher has stated somewhere (as he is completely uninteresting, I can’t be bothered to look it up) that he is out to provoke. The proper response to him doing his provoking here, then, would be to let it stand.

And then just.. ignore. I mean, stupid posts come on blogs all the time. Nothing distinguishes Mosher’s stupidity from the rest, apart from his admission to being quarrelsome.
chris says

7 Sep 2011 at 3:30 PM

Septic and DanH,

you guys are rather missing the point. Spencer’s paper is a (likely knowingly) fundamentally flawed effort. Its interpretations are simply not supported by the broader evidence base. In the general course of things it would simply be ignored. However it has been trumpeted through the blogosphere and in the media as an apparent counter to the science that informs us about climate response to radiative forcing and the role of clouds.

In that light, Wagner’s resignation and Dessler’s paper are valuable. Of course Wagner’s resignation is an entirely personal matter and it’s secondary whether its consequences are “good” or “bad” – Wagner simply did what he felt was right. However the consequences are positive to my mind. A paper of little worth has been overblown to serve non-science agendas; rather than simply let this pass, the two aspects of Spencer’s bad faith are highlighted – the bad faith abuse of the publishing process has been highlighted by Wagner’s resignation; the bad faith abuse of standard scientific practice (cherry picking particular models to support a preconceived point of view; fundamentally misplacing the relative roles of ENSO and climate sensitivity in model/empirical comparisons; flawed/absent statistical analysis etc) is highlighted by Dessler’s paper.

That’s all useful. There isn’t much new science…but we’re a good bit clearer about (i) the state of the science that bears on climate response to forcing/ENSO/cloud feedbacks and (ii) the nature of efforts to pursue non-science agendas.
trrll says

7 Sep 2011 at 3:37 PM

Looking at Dr. Spencer’s initial response to Dessler, he notes that he finds it “pretty clever” to apply his lag correlation analysis to the output of GCM models, which do not include a mechanism for clouds to provide surface temperature changes. I was surprised by this comment, because it was a test that immediately occurred to me when I read his paper, in which he seems to be arguing that such lag correlations are diagnostic of the existence of such a mechanism. This reinforces my opinion that Dr. Spencer was poorly served by the reviewers of his original submission. I’m not even a climate scientist, but if I had been asked to review his manuscript, this is a test that I would have insisted upon.

I don’t know if the poor reviewing was in any way Spencer’s fault, but I’ll note that it can be tempting to suggest reviewers whom you think will “go easy” on your submission. This is almost always a mistake; it is better to have a reviewer express harsh criticisms to you before publication, so that you can fix or preemptively rebut them in the text, than to have your critics saying similar things to one another after publication.

Spencer at this point still does not seem to quite grasp the significance of what Dessler has done. He comments “But look at what Dessler has done: he has used models which DO NOT ALLOW cloud changes to affect temperature, in order to support his case that cloud changes do not affect temperature! While I will have to think about this some more, it smacks of circular reasoning.”

But of course, that is not what Dessler is arguing–he is arguing that lag correlations coefficients such as Spencer reported do not constitute compelling evidence for a mechanistic link between clouds and surface heating. There is nothing circular about this. Spencer might still be able to argue that in this particular case, the pattern of correlation coefficients is the result of such a link (Dessler proved that it doesn’t have to be; he didn’t prove that it can’t be), but that will require a more sophisticated argument on Spencer’s part–which if he’d gotten a proper review, he would have known in the first place that he needed to provide.

I should note that I do not agree with the suggestions that I’ve seen that the paper should have been withdrawn by the journal due to the failure of review. Even if the editors now believe that the paper is wrong, that would be inappropriate. Even though it is true that publications such as this can be used for political purposes in an unfortunate way (and even though Spencer himself appears to have encouraged this), it is not scientific malfeasance to be wrong in a publication, and erroneous conclusions slip past review all the time. Besides, I can think of a number of individuals in various fields who have enhanced the progress of science substantially despite being consistently wrong about nearly everything important, simply because their errors stimulated others to clarify their own thinking and carry out important experiments and analyses.
Septic Matthew says

7 Sep 2011 at 3:40 PM

68, Sphaerica(Bob): Please note that “not everyone” in this sentence refers, as far as I can tell, exclusively to climate contrarians.

Dhogaza wrote that it was a matter of honor.
steven mosher says

7 Sep 2011 at 3:41 PM

Martin,

I’m just thankful that we finally agree on something: that basic due diligence requires these types of checks. I think if we spent more time focusing on our agreements rather than our disagreements that we might make some progress. I’ll note, as I did before, that I find SB11 most unconvincing, in particular WRT the way they only showed the results from a few models. I think this was misleading. I know for some that this tactic played on the assumption that other models would fall in between the high sensitivity models and the low ones. It was also interesting to note that the models that matched best (D11) had sensitivities around 3.4, although I’m not at all clear that this has anything to do with the ECR. Your thoughts? Personally, I’d like to see more discussion about using data to do System Identification.
Septic Matthew says

7 Sep 2011 at 3:43 PM

71, chris: Its interpretations are simply not supported by the broader evidence base.

That’s one way to phrase it.
steven mosher says

7 Sep 2011 at 3:48 PM

Oystein.

I think that the best ideas do survive a good argument.
And you will note that Martin and I have come to agreement. Basic due diligence requires the kinds of checks he suggests. Forget, if you must, that people you don’t happen to like have uttered these same words. Forget that people argued with the common sense notion. Be happy that we agree. Basic due diligence requires these kinds of checks. If you disagree with that, now would be the time to make your argument. And this would be the place. And Martin will tell you why you are wrong. I promise not to say a thing when you disagree with me and he corrects you.
Doug Bostrom says

7 Sep 2011 at 3:56 PM

…56,000 downloads speaks to a business success…

In terms of notoriety with a certain demographic, sure, free downloads contributing no revenue, no gain in reputation where it counts and a subsequent embarrassing exposition of grave flaws are undeniably successful customer relations in the Biblical sense. Question is, who were the customers and who came out on top? Where was success to be found?
Bill Hunter says

7 Sep 2011 at 4:00 PM

Resigning for incompetence! What a unique concept!

Let’s face it, self-immolation is a practice only conducted for personal betterment (e.g. better than being fired, or for personal gain – you were induced) or as a political statement.

We can probably presume it wasn’t a resign or get fired choice.

[Response: The old ‘argument from personal incredulity’ again. Just because you can’t imagine ever doing something honorable, it doesn’t mean it never happens. – gavin]
dhogaza says

7 Sep 2011 at 4:10 PM

Septic Matthew:

Dhogaza wrote that it was a matter of honor.

Are you allergic to the word “perhaps” that goes along with that quote?
One Anonymous Bloke says

7 Sep 2011 at 4:25 PM

The deniers’ false narrative (that mainstream climatology is political) forces them into this increasingly ludicrous position. On the one hand we have Professor Wagner, articulate and open about the reasons for his decision to step aside, and on the other we have the paranoid denier narrative, seeing conspiracy everywhere, IPCC gatekeepers hiding under the bed, dogs and cats living together…
The basis for their allegations? Well, nothing actually, not a skerrick of evidence, just assertion piled upon imagination piled upon assumption.
Russ R. says

7 Sep 2011 at 4:40 PM

@Sphaerica (Bob)

Hi Bob,

“None of your complaints in any way address the real issue, which is that Spencer tries to infer climate sensitivity from his study when that is not possible.”

This conclusion is a bit of a problem for me. Is it really “not possible” to assess climate sensitivity from observational data?

“As is pointed out by Dessler, all he is doing when he compares models to observations is to evaluate the ability of the models to accurately predict ENSO events, which is a known challenge that is being worked on and improving.”

Looking at a 10-year period of observations, particularly during a stretch of time where GHG forcings should have been steadily increasing (in conjunction with all of the associated feedbacks), how is it possible that the only material temperature signal is attributable exclusively to ENSO events?

After a good amount of time and effort spent pondering this question, I can only come up with a handful of potential explanations:

1. The modelled direct GHG forcing is overestimated. (Included only for the sake of completeness, since this would be effectively impossible.)

2. The modelled feedbacks are overestimated. (As Spencer and others would have us believe.)

3. The ENSO forcing is much larger than the atmospheric forcing and feedback signals, and as such they’re being completely washed out. (Which I believe is Dessler’s position, but if this is the case and the radiative forcings and feedbacks are indeed so small that they’re indistinguishable from noise, what’s the cause for concern?).

4. There were some other cooling forcings or feedbacks at play which were not captured in the models but were masking the GHG warming signal. (Often speculated, but with no solid supporting evidence as yet.)

5. The 10-year period being studied was some sort of ENSO outlier, and the modeled relationship would hold up under normal periods. (Hard to demonstrate this)

6. 10-years is insufficient to draw conclusions, and a larger sample size is required to see meaningful relationships. (As Santer et al. recently wrote.)

7. The SB11 methodology of regression analysis is fundamentally mis-specified for the task. (But then what is the correct methodology? Is it even possible to evaluate? And if not, then what does that say about the falsifiability of the models?)

(My sincere apologies if the above list is incomplete, incorrect, or imprecise… it’s my own novice interpretation, and I’m open to corrections.)
Bill Hunter says

7 Sep 2011 at 5:17 PM

“Just because you can’t imagine ever doing something honorable, it doesn’t mean it never happens. – gavin”

I don’t think that’s what I said. Accepting responsibility is an honorable thing to do.

But where do you draw the line? Putting a bullet in your head?

What is not being described here is the damage that cannot be corrected by any other means, and that suggests it’s actually sane to go beyond simply accepting responsibility.
dhogaza says

7 Sep 2011 at 5:32 PM

Russ R:

“None of your complaints in any way address the real issue, which is that Spencer tries to infer climate sensitivity from his study when that is not possible.”

This conclusion is a bit of a problem for me. Is it really “not possible” to assess climate sensitivity from observational data?

Russ, surely you can understand that the limited data used by Roy in his study doesn’t equate to “observational data” in the universal sense.
Russell says

7 Sep 2011 at 6:28 PM

If Remote Sensing charged the same as Climatic Change, the downloads would have cost Heartland’s torch and pitchfork chorus $2,352,000
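The figure above can be checked against the 56,000 downloads mentioned earlier in the thread. The per-download price below is inferred from those two numbers and is not stated anywhere in the thread, so treat it as an assumption:

```python
# Figures quoted in the thread: roughly 56,000 downloads and a $2,352,000 total.
downloads = 56_000
quoted_total_usd = 2_352_000

# Implied per-download price (an inference, not a figure given in the thread).
implied_price_usd = quoted_total_usd / downloads

print(f"Implied price per download: ${implied_price_usd:.2f}")
```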
Dan H. says

7 Sep 2011 at 6:44 PM

I have always had a problem with researchers using that data which best supports their premise. Unfortunately, it is not restricted to Spencer. A rebuttal in another journal does the same thing using different data.
Big Dave says

7 Sep 2011 at 7:08 PM

One Anonymous Bloke — 7 Sep 2011 @ 4:25 PM

“The deniers’ false narrative (that mainstream climatology is political) forces them into this increasingly ludicrous position. On the one hand we have Professor Wagner, articulate and open about the reasons for his decision to step aside, and on the other we have the paranoid denier narrative, seeing conspiracy everywhere, IPCC gatekeepers hiding under the bed, dogs and cats living together…
The basis for their allegations? Well, nothing actually, not a skerrick of evidence, just assertion piled upon imagination piled upon assumption.”

Bloke
Do you have any thoughts about why Wagner would single out Trenberth for his “I’m sorry” note? It seemed very odd to me.
Cheers
Big Dave

[Response: And thus a new conspiracy theory is born. I mean, really, of all the things that one might worry about – why the Forbes piece was so bad, why the UAH press release went so overboard, what Spencer’s correlations actually signify, – you think the oddest thing is an email which you have no direct knowledge of, between two people you don’t know (and at least one of which you had never heard of a week ago) expressing sentiments which depend entirely on the (again unknown) context to be interpreted. While this might be fascinating to you, it is almost the definition of untethered pointless triviality to me. Please take it somewhere else. – gavin]
Septic Matthew says

7 Sep 2011 at 7:51 PM

80, Russ R.

Your numbers 4 and 7 are related. If there are other important variables (intermediate steps to account for the lag, or much evidence that there is no lag), then the simple correlational methods of SB11 and D10 and D11 are inadequate. That the time series is short and possibly unrepresentative of most decades are also problems.

It would be strange to prefer the model with a lower squared correlation coefficient (D10 compared to SB11), but SB11 would acquire more credibility if they knew, and modeled, the mechanism producing the lag effect. I think that their paper will inspire more searches for such a mechanism. Especially if their result holds up through 2030 data.
Sphaerica (Bob) says

7 Sep 2011 at 8:01 PM

80, Russ R,

Is it really “not possible” to assess climate sensitivity from observational data?

You tell me. Climate scientists regularly say you need at least 30 years to detect a trend. At just under 15 years Phil Jones said that warming for that period was not yet statistically significant. It reached the point of statistical significance just at 15 years, but only to detect the warming, nothing more.

There are probably two or three dozen papers that try to estimate climate sensitivity, all written over the course of the past two decades, and even more attempts to infer it from combinations of such papers.

Do you really think that someone could be so brilliant as to tease out climate sensitivity from just ten years of observations, using a simple single equation box model?

Beyond this, the month to month and annual variations caused by ENSO, as well as other noise in the system, vastly masks the actual signal (warming). How could one possibly ever infer climate sensitivity from such a situation? It would be rather like successfully inferring the entire content of the U.S. Constitution by looking only at every 53rd word of the text.

[Response: The question depends entirely on context. “Observational data” is a very large amount of data indeed, and I’m sure that there are constraints on climate sensitivity to be found within it. However, looking at short term trends and variability in global average quantities is not likely to be that useful. Nor is the seasonal cycle, or (on its own) the response to ENSO, nor trends over the 20th C (because of the uncertainties in the forcings). I’m far more impressed by looking at the paleo-data for good observational constraints (i.e. Kohler et al, 2009; Annan and Hargreaves, 2006; etc.) since we have good candidate periods that were close to radiative equilibrium and for which many of the drivers can be quantified. As an aside, Spencer bizarrely thinks that the prior statement implies I am ignoring “many years of detailed global satellite observations of today’s climate system” and “giving science a bad name” (something in which I will accept he has more experience). – gavin]
Ray Ladbury says

7 Sep 2011 at 9:22 PM

Russ R., Let me see if I can come up with a cosmological argument:

The electromagnetic force is far stronger than the gravitational force, and yet cosmology is dominated by gravitation. There must be something critically wrong with General Relativity, right? Woohoo! Let’s go slide down a black hole. Oh, wait. Positive and negative charges balance each other over large distances, so the net electromagnetic field over large distances is zero.

Likewise, the El Nino Southern OSCILLATION doesn’t dominate climate on a long timescale precisely because it is an OSCILLATION. It goes up and down. Greenhouse forcing goes up and up. Is this really so hard to grasp?
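The oscillation-versus-trend point can be sketched numerically. This is a toy illustration only: the trend, amplitude and period below are hypothetical values, not fitted to any climate record, and `fitted_slope` is a helper name introduced here.

```python
import math

# Hypothetical toy values, chosen only for illustration.
TREND_PER_YEAR = 0.02    # steady upward drift
OSC_AMPLITUDE = 0.3      # size of the up-and-down swing
OSC_PERIOD_YEARS = 4.0   # length of one full swing


def fitted_slope(years, samples_per_year=12):
    """Least-squares slope of (trend + oscillation) over the first `years` years."""
    n = int(years * samples_per_year)
    ts = [i / samples_per_year for i in range(n)]
    ys = [
        TREND_PER_YEAR * t
        + OSC_AMPLITUDE * math.sin(2 * math.pi * t / OSC_PERIOD_YEARS)
        for t in ts
    ]
    mean_t = sum(ts) / n
    mean_y = sum(ys) / n
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(ts, ys))
    sxx = sum((t - mean_t) ** 2 for t in ts)
    return sxy / sxx


if __name__ == "__main__":
    for years in (3, 5, 40):
        print(f"{years:>2} years: fitted slope = {fitted_slope(years):+.4f} per year")
```

In this toy setup the short-window fits are dominated by where the oscillation happens to be in its cycle, while the 40-year fit lands close to the underlying trend, because the swings largely cancel over many cycles and the drift keeps accumulating.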
Pete Dunkelberg says

7 Sep 2011 at 11:51 PM

“Is this really so hard to grasp?”

You know the answer, don’t you?
dhogaza says

8 Sep 2011 at 12:15 AM

Dan H:

I have always had a problem with researchers using that data which best supports their premise. Unfortunately, it is not restricted to Spencer. A rebuttal in another journal does the same thing using different data.

No, it does not. Spencer says “we show that models (all of them, in general) don’t match the observational data”. The rebuttal shows that Spencer only discusses those that are outliers regarding computed sensitivity and that don’t do a good job of generating ENSO-like events in support. Spencer looked at 14 models and threw out those that don’t meet his “models suck” pre-determined conclusion.

What Dessler does is show that the models that more realistically generate ENSO-like events, which (probably not coincidentally) also generate mid-range sensitivity numbers, actually do a pretty good job of matching Spencer’s observational data.

If you don’t see the difference, you’re simply refusing to open your eyes.

Of course, I could be wrong, perhaps you meant that Dessler’s premise is that those models that best model ENSO events will more closely match observations taken during ENSO events. But this premise isn’t at all like Spencer’s cherry picking, so I’m sure you didn’t mean it. Particularly given your track record …
Martin Vermeer says

8 Sep 2011 at 12:57 AM

Mosher, the only thing I can see wrong in Øystein’s comment is his estimation that you are stupid. My description of you would contain colourful language — which civility bids me not to elaborate on — but it would not contain the word stupid.

Sometimes you even say the truth, or something close to it; stuff happens. I’ll try not to hold that against the truth ;-)
Martin Vermeer says

8 Sep 2011 at 1:12 AM

Do you have any thoughts about why Wagner would single out Trenberth for his “I’m sorry” note? It seemed very odd to me.

Not at all… Wagner made the apology that Spencer should have made. Elaboration here.

BTW you know what I find odd? How come the faceless powers that forced Wagner to resign, and then to remain quiet about being forced to resign, and to write an elaborate note with apparently made-up reasons, didn’t manage to force a simple retraction of the paper? There’s a mystery for you.
One Anonymous Bloke says

8 Sep 2011 at 5:34 AM

Big Dave #84 – Do I have any thoughts? Yes, I think that you need to watch out for that confirmation bias, and pay more attention to the evidence, rather than anecdotes or advocacy.
Ray Ladbury says

8 Sep 2011 at 9:06 AM

I think that it is revealing that the denialist camp views climate science as having two sides. It doesn’t. In climate science as in all science, the only thing that matters is whether your theories, viewpoints and ideas help you better understand the subject you are researching. Spencer’s work is rejected not because it doesn’t fit into the climate orthodoxy or even because it is incorrect, but rather because it leads nowhere. We do not emerge from reading Spencer with a better understanding of anything. Even if Spencer were 100% correct and proper, the question it would raise would be how we understand his results in light of the fact that all other lines of evidence suggest a much higher climate sensitivity and feedback.

Published as it was, SB11 was more about providing political cover than about trying to advance understanding. Viewed in this light, Wagner’s resignation makes perfect sense.
Russ R. says

8 Sep 2011 at 9:07 AM

Thanks for all the engaging responses. I’ll try to address them individually.

#82 dhogaza
“Russ, surely you can understand that the limited data used by Roy in his study doesn’t equate to “observational data” in the universal sense.”

That depends on what you mean by “limited data”. If you mean insufficient sample size, then that’s item #6 on my list (i.e. 10 years isn’t long enough).

If however, you’re suggesting the observational data Spencer is using isn’t relevant or applicable to what he’s trying to deduce, then that’s captured in my list as item #7 (i.e. the methodology may be mis-specified for the task), in which case no amount of additional CERES data will shed any more light on clouds’ impact on the earth’s radiation budget and climate sensitivity. This would be quite disappointing given the amount invested in the project.

#84 Septic Matthew

“SB11 would acquire more credibility if they knew, and modeled, the mechanism producing the lag effect.” Agree, but this is a next step… well outside the scope of the paper, which merely identifies a stronger lag relationship than is currently reflected in most models. Once everyone agrees on the existence and approximate size of the gap (and D11 would confirm that there is a significant gap, albeit narrower than suggested by SB11), then people can turn their attention to developing and testing hypotheses for the mechanisms that cause the gap. (The work currently going on at CERN is an example of this… it may show evidence of cloud forcing that might fill some or all of the gap… or it might not.)

#86 Sphaerica (Bob)
“Do you really think that someone could be so brilliant as to tease out climate sensitivity from just ten years of observations, using a simple single equation box model?”
The “ten years” issue was also in my list as #6, but I don’t entirely agree with your objection to a “simple single equation box model”. The equation being used (i.e. “change in temperature * heat capacity = net energy transfer”) is pretty fundamental for a closed system over a period of time.

#86 Gavin (in Response)
” I’m far more impressed by looking at the paleo-data for good observational constraints (i.e. Kohler et al, 2009; Annan and Hargreaves, 2006; etc.) since we have good candidate periods that were close to radiative equilibrium and for which many of the drivers can be quantified”

I might be completely misunderstanding you here, but when there are multiple variables simultaneously acting on a system in equilibrium, it becomes almost impossible to identify the sensitivity to any one variable. (An analogy from economics is the price and quantity relationship in response to changes in supply and demand.) Rather than observing periods of equilibrium, wouldn’t the most valuable observations be instances where one variable changes materially and quantifiably, while all others are effectively held constant, so that one can observe the temporary disequilibrium and subsequent shift to a new equilibrium state (e.g. volcanic events)?

#87 Ray Ladbury

I quite like your cosmological analogy. If the same characteristics apply in this case, then hopefully more years of CERES data will help differentiate the cumulative effects of GHG forcing from the cyclical effects of ENSO events. This was basically item #6 on my list.
Øystein says

8 Sep 2011 at 9:59 AM

Martin, just to clarify:

I don’t think Mosher is stupid. That claim would take a lot of evidence. His post here, and other posts he has made, however, are stupid. And deserve no attention.

Which means I just wasted another minute of my life… oh, the irony
Pete Dunkelberg says

8 Sep 2011 at 10:09 AM

Re both Spencer and Lindzen, don’t neglect the conservation of energy problem – see Rabett.
Ray Ladbury says

8 Sep 2011 at 11:31 AM

Russ R.,
More years of CERES data? First, we’ve got ~40 years of data that show behavior that is strikingly consistent with what we expect from a greenhouse mechanism–and inconsistent with any other explanation. Second, CERES is not even the right tool for the job! Ideally, we ought to have an entire array of satellites looking at both incoming solar radiation (a la the upcoming Total Solar Irradiance Sensor–TSIS) AND measuring outgoing radiation over the entire globe. And we would need about 30 years of it–because the climate, fundamentally, is not a one-box model. You need at least 2 boxes to capture the effects of the oceans or very long time-series to average out their effects.

None of this is at all controversial or new.
Septic Matthew says

8 Sep 2011 at 11:46 AM

94, Russ R: Rather than observing periods of equilibrium, wouldn’t the most valuable observations be instances where one variable changes materially and quantifiably, while all others are effectively held constant, so that one can observe the temporary disequilibrium and subsequent shift to a new equilibrium state (e.g. volcanic events)?

That’s why they do experiments in industrial process control. I’m guessing you knew that. In climate science, it is difficult to assure the condition of ceteris paribus. I’m guessing you knew that also.

I hope you enjoy your time here at RealClimate, and return often.