
Science is self-correcting: Lessons from the arsenic controversy

Filed under: — group @ 29 December 2010

Recent attention to NASA’s announcement of ‘arsenic-based life’ has provided a very public window into how science and scientists operate. Debate surrounds the announcement of any controversial scientific finding. In the case of arseno-DNA, the discussion that is playing out on the blogs is very similar to the process that usually plays out in conferences and seminars. This discussion is a core process by which science works.

Post-holiday round-up

Filed under: — group @ 28 December 2010

What with holiday travel, and various other commitments, we’ve missed a few interesting stories over the last week or so.

First off, AGU has posted highlights from this year’s meeting – mainly the keynote lectures – and there are a few interesting presentations, for instance from Tim Palmer on how to move climate modelling forward, Ellen Mosley-Thompson on the ice records, and David Hodell on abrupt climate change during the last deglaciation. (We should really have a ‘videos’ page where we can post these links more permanently – all suggestions for other videos to be placed there can be made in the comments).

More relevant for scientist readers might be Michael Oppenheimer’s talk on the science/policy interface and what scientists can usefully do, in the first Stephen Schneider Lecture. There was a wealth of coverage on AGU in general, and for those with patience, looking through the twitter feeds with #agu10 shows up a lot of interesting commentary from both scientists and journalists. Skeptical Science and Steve Easterbrook also have good round ups. [edited]

Second, there was a great front page piece in the New York Times by Justin Gillis on the Keeling curve – and the role that Dave Keeling’s son, Ralph, is playing in continuing his father’s groundbreaking work. Gillis had a few follow-up blogs that are also worth reading. We spend a lot of time criticising media coverage of climate change, so it’s quite pleasing to be praising a high profile story instead.

Finally, something new. Miloslav Nic has put together a beta version of an interactive guide to IPCC AR4, with clickable references, cited-author searches (for instance, all the Schneiders) and journal searches. This should be a very useful resource, and hopefully something IPCC can adopt for themselves in the next report.

Back to normal posting soon….

Cold winter in a world of warming?

Filed under: — rasmus @ 14 December 2010

Last June, during the International Polar Year conference, James Overland suggested that there are more cold and snowy winters to come. He argued that the exceptionally cold snowy 2009-2010 winter in Europe had a connection with the loss of sea-ice in the Arctic. The cold winters were associated with a persistent ‘blocking event’, bringing in cold air over Europe from the north and the east.


Responses to McShane and Wyner

Filed under: — group @ 13 December 2010

Gavin Schmidt and Michael Mann

Readers may recall a flurry of excitement in the blogosphere concerning the McShane and Wyner paper in August. Well, the discussions on the McShane and Wyner paper in AOAS have now been put online. There are a stunning 13 different discussion pieces, an editorial and a rebuttal. The invited discussions and rebuttal were basically published ‘as is’, with simple editorial review, rather than proper external peer review. This is a relatively unusual way of doing things in our experience, but it does seem to have been effective at getting rapid responses with a wide variety of perspectives, though without peer review, a large number of unjustified, unsupportable and irrelevant statements have also got through.

A few of these discussions were already online, e.g. from Martin Tingley, from Schmidt, Mann and Rutherford (SMR), and one from Smerdon. Others, including contributions from Nychka & Li, Wahl & Ammann, McIntyre & McKitrick, Smith, Berliner and Rougier, are newly available on the AOAS site, and we have not yet read them as carefully.

Inevitably, focus in the discussions is on problems with MW, but it is worth stating upfront here (as is also stated in a number of the papers) that MW made positive contributions to the discussion as well – they introduced a number of new methods (and provided code that allows everyone to try them out), and their use of the Markov Chain Monte Carlo (MCMC) Bayesian approach to assess uncertainties in the reconstructions is certainly interesting. This does not excuse their rather poor framing of the issues, and the multiple errors they made in describing previous work, but it does make the discussions somewhat more interesting than a simple error-correcting exercise might have been. MW are also to be commended on actually following through on publishing a reconstruction and its uncertainties, rather than simply pointing to potential issues and never working through the implications.

The discussions raise some serious general issues with MW’s work – with respect to how they use the data, the methodologies they introduce (specifically the ‘Lasso’ method), the conclusions they draw, whether there are objective methods to decide whether one method of reconstruction is better than another, and whether the Bayesian approach outlined in the last part of the paper is really what it is claimed to be. But there are also a couple of issues very specific to the MW analysis; for instance, the claim that MW used the same data as Mann et al, 2008 (henceforth M08).

On that specific issue, presumably just an oversight, MW apparently used the “Start Year” column in the M08 spreadsheet instead of the “Start Year (for recon)” column. The difference between the two is related to the fact that many tree ring reconstructions only have a small number of trees in their earliest periods, which greatly inflates their uncertainty (and therefore reduces their utility). To reduce the impact of this problem, M08 only used tree ring records when they had at least 8 individual trees, which left 59 series in the AD 1000 frozen network. The fact that there were only 59 series in the AD 1000 network of M08 was stated clearly in the paper, and the criterion regarding the minimal number of trees (8) was described in the Supplementary Information. The difference in results between the correct M08 network and the spurious 95-record network that MW actually used is unfortunately quite significant. Using the correct data substantially reduces the estimates of peak medieval warmth shown by MW (as well as reducing the apparent spread among the reconstructions). This is even more true when the frequently challenged “Tiljander” series are removed, leaving a network of 55 series. In their rebuttal, MW claim that the M08 quality control is simply an ‘ad hoc’ filtering and deny that they made a mistake at all. This is not really credible, and it would have done them much credit to simply accept this criticism.
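The column mix-up can be illustrated with a toy filter (a sketch of our own, in Python; the field names are invented stand-ins for the spreadsheet columns, and only the thresholds follow M08's stated criteria):

```python
# Hypothetical sketch of the M08 screening described above: a series joins
# the AD 1000 network only if its reconstruction start year (which already
# accounts for the minimum-of-8-trees requirement) reaches back to AD 1000.
# Field names are illustrative, not the actual spreadsheet headers.
NETWORK_START = 1000

proxies = [
    # well replicated from the start: both columns agree
    {"name": "A", "start_year": 950, "start_year_for_recon": 950},
    # too few trees before AD 1150, so the usable start is pushed later
    {"name": "B", "start_year": 980, "start_year_for_recon": 1150},
]

# Correct filter: use the "(for recon)" column.
correct = [p for p in proxies if p["start_year_for_recon"] <= NETWORK_START]

# The mix-up: filtering on the raw "Start Year" column wrongly admits B.
spurious = [p for p in proxies if p["start_year"] <= NETWORK_START]
```

Filtering on the wrong column inflates the network (here 2 series instead of 1), which is the same kind of discrepancy as the 95-vs-59 series difference discussed above.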

With just this correction, applying MW’s own procedures yields strong conclusions regarding how anomalous recent warmth is in the longer-term context. MW found recent warmth to be unusual in a long-term context: they estimated an 80% likelihood that the decade 1997-2006 was warmer than any other for at least the past 1000 years. Using the more appropriate 55-proxy dataset with the same estimation procedure (which involved retaining K=10 PCs of the proxy data) yields a higher probability of 84% that recent decadal warmth is unprecedented for the past millennium.

However, K=10 principal components is almost certainly too large, and the resulting reconstruction likely suffers from statistical over-fitting. Objective selection criteria applied to the M08 AD 1000 proxy network, as well as independent “pseudoproxy” analyses (discussed below), favor retaining only K=4 PCs. (Note that MW correctly point out that SMR made an error in calculating this, but correct application of the Wilks (2006) method fortunately does not change the result: 4 PCs should be retained in each case.) This choice yields a very close match with the relevant M08 reconstruction. It also yields considerably higher probabilities (up to 99%) that recent decadal warmth is unprecedented for at least the past millennium. These posterior probabilities imply substantially higher confidence than the “likely” assessment by M08 and IPCC (2007) (a 67% level of confidence). Indeed, a probability of 99% not only exceeds the IPCC “very likely” threshold (90%), but reaches the “virtually certain” (99%) threshold. In this sense, the MW analysis, using the proper proxy data and proper methodological choices, yields inferences regarding the unusual nature of recent warmth that are even more confident than those expressed in past work.

An important real issue is whether proxy data provide more information than naive models (such as the mean of the calibrating data, for instance) or outperform random noise of various types. This is something that has been addressed in many previous studies, which have come to very different conclusions than MW, and so the reasons why MW came to their conclusion are worth investigating. Two factors appear to be important – their use of the “Lasso” method exclusively to assess this, and the use of short holdout periods (30 years) for both extrapolated and interpolated validation periods.
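That skill question can be sketched with a minimal synthetic example (our own illustration, not MW's code or data): calibrate a simple regression on most of a record, hold out a short 30-year block as in MW's validation setup, and compare its error against the naive mean-of-calibration model.

```python
# Toy validation exercise: does a proxy-based regression beat the naive
# "mean of the calibration period" prediction over a short holdout block?
# All numbers here are synthetic and purely illustrative.
import numpy as np

rng = np.random.default_rng(0)
n = 150
temp = np.linspace(-0.3, 0.5, n) + 0.1 * rng.standard_normal(n)  # "true" temps
proxy = 2.0 * temp + 0.3 * rng.standard_normal(n)  # proxy = signal + noise

holdout = 30  # short holdout window, as in MW's validation setup
cal_t, val_t = temp[:-holdout], temp[-holdout:]
cal_p, val_p = proxy[:-holdout], proxy[-holdout:]

# Calibrate temperature against the proxy, then predict the holdout block.
slope, intercept = np.polyfit(cal_p, cal_t, 1)
pred = slope * val_p + intercept

rmse_model = np.sqrt(np.mean((pred - val_t) ** 2))
rmse_naive = np.sqrt(np.mean((cal_t.mean() - val_t) ** 2))
```

With a genuinely informative proxy, the regression beats the naive model; the disputes in the discussions are about which validation statistics and holdout lengths make that comparison fair.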

So how do you assess how good a method is? This is addressed in almost half of the discussion papers. Tingley in particular gives strong evidence that Lasso is not in fact a very suitable method, and is outperformed by his Composite Regression method in test cases, while Kaplan points out that noise with significant long-term trends will also perform well in interpolation. Both Smith and the paper by Craigmile and Rajaratnam also address this point.

In our submission, we tested all of the MW methods in “pseudoproxy” experiments based on long climate simulations (a standard benchmark used by practitioners in the field). Again, Lasso was outperformed by almost every other method, especially the EIV method used in M08, but even in comparison with the other methods MW introduced. The only support for ‘Lasso’ comes from McIntyre and McKitrick, who curiously claim that the main criterion in choosing a method should be how long it has been used in other contexts, regardless of how poorly it performs in practice for a specific new application. A very odd criterion indeed, which if followed would lead to the complete cessation of any innovation in statistical approaches.
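The pseudoproxy idea itself is simple and can be sketched as follows (a toy version of our own with white noise and a red-noise stand-in for a model simulation; real pseudoproxy studies use long GCM runs and more realistic noise structures):

```python
# Pseudoproxy sketch: take a known "climate" signal, contaminate it with
# noise at a chosen signal-to-noise ratio (SNR), and any reconstruction
# method can then be scored against the known truth.
import numpy as np

rng = np.random.default_rng(3)
years = 1000
truth = np.cumsum(0.02 * rng.standard_normal(years))  # red-noise "climate"

def make_pseudoproxies(signal, n_proxies, snr, rng):
    """Degrade a known signal with white noise at a given SNR."""
    noise_sd = signal.std() / snr
    noise = noise_sd * rng.standard_normal((len(signal), n_proxies))
    return signal[:, None] + noise

good_net = make_pseudoproxies(truth, 20, snr=2.0, rng=rng)  # informative
poor_net = make_pseudoproxies(truth, 20, snr=0.3, rng=rng)  # mostly noise

def mean_corr(network):
    """Average correlation of each pseudoproxy with the known truth."""
    return np.mean([np.corrcoef(truth, network[:, j])[0, 1]
                    for j in range(network.shape[1])])
```

Because the truth is known by construction, method comparisons (Lasso vs. EIV, etc.) become objective rather than a matter of taste: whichever method recovers the withheld signal best, wins.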

The MW rebuttal focuses a lot on SMR, and we will take the time to look into the specifics more closely, but some of their criticism is simply bogus. They claim our supplemental code was not usable, but in fact we provided a turnkey R script for every single figure in our submission – something not true of their code – so that is a little cheeky of them [as is declaring one of us to be a mere blogger, rather than a climate scientist ;-) ]. They make a great deal of the fact that we only plotted the ~50 year smoothed data rather than the annual means. But this seems to be more a function of their misconstruing what these reconstructions are for (or are capable of) than a real issue. Not least of which, the smoothing allows the curves and methods to be more easily distinguished – it is not a ‘correction’ to plot noisy annual data in order to obscure the differences in results!

Additionally, MW make an egregiously wrong claim about centering in our calculations. All the PC calculations use prcomp(proxy, center=TRUE, scale=TRUE) to specifically deal with that, while the plots use a constant baseline of 1900-1980 for consistency. They confuse plotting convention with a calculation.
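The point is easy to verify in any language; a numpy analogue of what `prcomp(proxy, center=TRUE, scale=TRUE)` does (our own sketch, not SMR's actual code) shows that shifting the input baseline by a constant leaves the PC scores untouched:

```python
# Centering/scaling happens inside the PC calculation; a plotting baseline
# (e.g. anomalies relative to 1900-1980) is a separate, cosmetic choice.
import numpy as np

def pc_scores(x):
    """PC scores of x after column centering and unit-variance scaling,
    mirroring R's prcomp(x, center=TRUE, scale=TRUE)."""
    z = (x - x.mean(axis=0)) / x.std(axis=0, ddof=1)
    _, _, vt = np.linalg.svd(z, full_matrices=False)
    return z @ vt.T  # scores = standardized data projected onto the PCs

rng = np.random.default_rng(1)
proxy = rng.standard_normal((100, 5)) + 10.0  # data far from a zero baseline

scores = pc_scores(proxy)
shifted_scores = pc_scores(proxy + 5.0)  # re-baseline the input by a constant
```

The scores come out centered regardless of where the raw data sit, and a constant shift in the input changes nothing, which is exactly why a plotting baseline cannot contaminate the PC calculation.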

There is a great deal to digest in these discussions, and so we would like to open the discussion here to all of the authors to give their thoughts on how it all stacks up, what can be taken forward, and how such interactions might be better managed in future. For instance, we are somewhat hesitant to support non-peer reviewed contributions (even our own) in the literature, but perhaps others can make a case for it.

In summary, there is much sense in these contributions, and Berliner’s last paragraph sums this up nicely:

The problem of anthropogenic climate change cannot be settled by a purely statistical argument. We can have no controlled experiment with a series of exchangeable Earths randomly assigned to various forcing levels to enable traditional statistical studies of causation. (The use of large-scale climate system models can be viewed as a surrogate, though we need to better assess this.) Rather, the issue involves the combination of statistical analyses and, rather than versus, climate science.

Hear, hear.

Feedback on Cloud Feedback

Filed under: — group @ 9 December 2010

Guest article by Andrew Dessler

I have a paper in this week’s issue of Science on the cloud feedback that may be of interest to realclimate readers. As you may know, clouds are important regulators of the amount of energy in and out of the climate system. Clouds both reflect sunlight back to space and trap infrared radiation that would otherwise escape to space. Changes in clouds can therefore have profound impacts on our climate.

A positive cloud feedback loop posits a scenario whereby an initial warming of the planet, caused, for example, by increases in greenhouse gases, causes clouds to trap more energy and lead to further warming. Such a process amplifies the direct heating by greenhouse gases. Models have long predicted this, but testing the models has proved difficult.

Making the issue even more contentious, some of the more credible skeptics out there (e.g., Lindzen, Spencer) have been arguing that clouds behave quite differently from what models predict. In fact, they argue, clouds will stabilize the climate and prevent climate change from occurring (i.e., clouds will provide a negative feedback).

In my new paper, I calculate the energy trapped by clouds and observe how it varies as the climate warms and cools during El Niño-Southern Oscillation (ENSO) cycles. I find that, as the climate warms, clouds trap an additional 0.54±0.74 W/m2 for every degree of warming. Thus, the cloud feedback is likely positive, but I cannot rule out a slight negative feedback.
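Schematically (with synthetic numbers, not the paper's data), this kind of calculation is an ordinary regression of cloud-trapped energy against temperature anomaly: the slope is the feedback, and its standard error determines whether a negative feedback can be ruled out.

```python
# Illustrative feedback regression on synthetic ENSO-like variability.
# The true slope and noise level are made up for this sketch.
import numpy as np

rng = np.random.default_rng(2)
n = 120  # months of synthetic "observations"
temp_anom = 0.8 * np.sin(np.linspace(0, 6 * np.pi, n))        # K, ENSO-like
cloud_flux = 0.54 * temp_anom + 0.7 * rng.standard_normal(n)  # W/m2, noisy

slope, intercept = np.polyfit(temp_anom, cloud_flux, 1)  # W/m2 per K

# Standard error of the slope: the error bar matters as much as the sign.
resid = cloud_flux - (slope * temp_anom + intercept)
ss_x = ((temp_anom - temp_anom.mean()) ** 2).sum()
stderr = resid.std(ddof=2) / np.sqrt(ss_x)
```

A positive estimated slope whose error bar still reaches below zero mirrors the paper's conclusion: the feedback is likely positive, but a slight negative feedback cannot be excluded.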