Many readers will remember our critique of a paper by Douglass et al on tropical tropospheric temperature trends late last year, and the discussion of the ongoing revisions to the observational datasets. Some will recall that the Douglass et al paper was trumpeted around the blogosphere as the definitive proof that models had it all wrong.
At the time, our criticism was itself criticised because our counterpoints had not been submitted to a peer-reviewed journal. However, this was a little unfair (and possibly a little disingenuous) because a group of us had in fact submitted a much better argued paper making the same principal points. Of course, the peer-review process takes much longer than writing a blog post and so it has taken until today to appear on the journal website.
The new 17-author paper (accessible pdf) (lead by Ben Santer), does a much better job of comparing the various trends in atmospheric datasets with the models and is very careful to take account of systematic uncertainties in all aspects of that comparison (unlike Douglass et al). The bottom line is that while there is remaining uncertainty in the tropical trends over the last 30 years, there is no clear discrepancy between what the models expect and the observations. There is a fact sheet available which explains the result in relatively simple terms.
Additionally, the paper explores the statistical properties of the test used by Douglass et al and finds some very odd results. Namely, that their test should nominally inadvertently reject a match 1 time out 20 (i.e. for a 5% significance), actually rejects valid comparisons 16 times out of 20! And curiously, the more data you have, the worse the test performs (figure 5 in the paper). The other aspect discussed in the paper is the importance of dealing with systematic errors in the data sets. These are essentially the same points that were made in our original blog post, but are now much more comprehensively shown. The data sources are now completely up-to-date and a much wider range of sources is addressed – not only the different satellite products, but also the different analyses of the radiosonde data.
The grey band is the real 2-sigma spread of the models (while the yellow band is the spread allowed for in the flawed Douglass et al test). The other lines are the different estimates from the data. The uncertainties in both preclude any claim of some obvious discrepancy – a result you can only get by cherry-picking what data to use and erroneously downplaying the expected spread in the simulations.
Taking a slightly larger view, I think this example shows quite effectively how blogs can play a constructive role in moving science forward (something that we discussed a while ago). Given the egregiousness of the error in this particular paper (which was obvious to many people at the time), having the initial blog posting up very quickly alerted the community to the problems even if it wasn’t a comprehensive analysis. The time in-between the original paper coming out and this new analysis was almost 10 months. The resulting paper is of course much better than any blog post could have been and in fact moves significantly beyond a simple rebuttal. This clearly demonstrates that there is no conflict between the peer-review process and the blogosphere. A proper paper definitely takes more time and gives generally a better result than a blog post, but the latter can get the essential points out very quickly and can save other people from wasting their time.