Once more unto the breach, dear friends, once more!
Some old-timers will remember a series of ‘bombshell’ papers back in 2004 which were going to “knock the stuffing out” of the consensus position on climate change science (see here for example). Needless to say, nothing of the sort happened. The issue in two of those papers was whether satellite and radiosonde data were globally consistent with model simulations over the same time. Those papers claimed that they weren’t, but they did so based on a great deal of over-confidence in observational data accuracy (see here or here for how that turned out) and an insufficient appreciation of the statistics of trends over short time periods.
Well, the same authors (Douglass, Pearson and Singer, now joined by Christy) are back with a new (but necessarily more constrained) claim, but with the same over-confidence in observational accuracy and a similar lack of appreciation of short term statistics.
Previously, the claim was that satellites (in particular the MSU 2LT record produced by UAH) showed a global cooling that was not apparent in the surface temperatures or model runs. That disappeared with a longer record and some important corrections to the processing. Now the claim has been greatly restricted in scope and concerns only the tropics, and the rate of warming in the troposphere (rather than the fact of warming itself, which is now undisputed).
The basis of the issue is that models produce an enhanced warming in the tropical troposphere when there is warming at the surface. This is true enough. Whether the warming is from greenhouse gases, El Niño events, or solar forcing, trends aloft are enhanced. For instance, the GISS model equilibrium runs with 2xCO2 or a 2% increase in solar forcing both show a warming maximum between roughly 20N and 20S at around 300mb (10 km):
The first thing to note about the two pictures is how similar they are. They both have the same enhancement in the tropics and similar amplification in the Arctic. They differ most clearly in the stratosphere (the part above 100mb) where CO2 causes cooling while solar causes warming. It’s important to note, however, that these are long-term equilibrium results and therefore don’t tell you anything about the signal-to-noise ratio for any particular time period or with any particular forcings.
If the pictures are very similar despite the different forcings that implies that the pattern really has nothing to do with greenhouse gas changes, but is a more fundamental response to warming (however caused). Indeed, there is a clear physical reason why this is the case – the increase in water vapour as surface air temperature rises causes a change in the moist-adiabatic lapse rate (the decrease of temperature with height) such that the surface to mid-tropospheric gradient decreases with increasing temperature (i.e. it warms faster aloft). This is something seen in many observations and over many timescales, and is not something unique to climate models.
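For readers who want to see the lapse-rate effect in numbers, here is a minimal sketch of the textbook saturated-adiabatic lapse rate formula, evaluated at a few near-surface temperatures. The constants, the Bolton-style fit for saturation vapour pressure and the function names are my own choices for illustration and are not taken from any of the models or papers discussed here. It only gives the local lapse rate for a saturated parcel at one pressure, so treat it as a rough illustration rather than a calculation of the full tropical profile.

```python
import math

G = 9.81          # gravitational acceleration, m s^-2
CP_DRY = 1004.0   # specific heat of dry air at constant pressure, J kg^-1 K^-1
R_DRY = 287.0     # gas constant for dry air, J kg^-1 K^-1
L_VAP = 2.5e6     # latent heat of vaporisation, J kg^-1 (treated as constant)
EPSILON = 0.622   # ratio of gas constants, dry air / water vapour


def saturation_vapour_pressure_hpa(temp_k):
    """Approximate saturation vapour pressure over water (Bolton 1980 fit)."""
    temp_c = temp_k - 273.15
    return 6.112 * math.exp(17.67 * temp_c / (temp_c + 243.5))


def moist_adiabatic_lapse_rate(temp_k, pressure_hpa=1000.0):
    """Saturated (moist) adiabatic lapse rate in K per km."""
    e_s = saturation_vapour_pressure_hpa(temp_k)
    r_s = EPSILON * e_s / (pressure_hpa - e_s)  # saturation mixing ratio, kg/kg
    numerator = G * (1.0 + L_VAP * r_s / (R_DRY * temp_k))
    denominator = CP_DRY + L_VAP ** 2 * r_s * EPSILON / (R_DRY * temp_k ** 2)
    return 1000.0 * numerator / denominator


if __name__ == "__main__":
    # Illustrative temperatures only; warmer air should give a smaller lapse rate
    for temp_k in (290.0, 295.0, 300.0):
        print(temp_k, round(moist_adiabatic_lapse_rate(temp_k), 2))
```

The warmer the saturated air, the smaller the lapse rate, which is the sense in which temperatures aloft change by more than temperatures at the surface.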
If this is what should be expected over a long time period, what should be expected on the short time-scale available for comparison to the satellite or radiosonde records? This period, 1979 to present, has seen a fair bit of warming, but also a number of big El Niño events and volcanic eruptions which clearly add noise to any potential signal. In comparing the real world with models, these sources of additional variability must be taken into account. It’s straightforward for the volcanic signal, since many simulations of the 20th century done in support of the IPCC report included volcanic forcing. However, the occurrence of El Niño events in any model simulation is uncorrelated with their occurrence in the real world and so special care is needed to estimate their impact.
Additionally, it’s important to make a good estimate of the uncertainty in the observations. This is not simply the uncertainty in estimating the linear trend, but the more systematic uncertainty due to processing problems, drifts and other biases. One estimate of that error for the MSU 2 product (a weighted average of tropospheric+lower stratospheric trends) comes from the fact that two different groups (UAH and RSS) arrive at tropical trends ranging from 0.048 to 0.133 °C/decade, a much larger difference than the simple uncertainty in the trend. In the radiosonde records, there is additional uncertainty due to adjustments to correct for various biases. This is an ongoing project (see RAOBCORE for instance).
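To make the distinction concrete, the "simple" uncertainty in a linear trend can be sketched as the standard error of an ordinary least-squares slope. The example below uses a synthetic monthly series (a made-up trend plus white noise, not real satellite data), and the function name linear_trend is purely illustrative. Real temperature series are autocorrelated, so this white-noise estimate would tend to be too small, and it says nothing at all about the systematic errors discussed above.

```python
import math
import random


def linear_trend(times, values):
    """Return (slope, standard error of slope) from an ordinary least-squares fit."""
    n = len(times)
    t_mean = sum(times) / n
    v_mean = sum(values) / n
    s_tt = sum((t - t_mean) ** 2 for t in times)
    s_tv = sum((t - t_mean) * (v - v_mean) for t, v in zip(times, values))
    slope = s_tv / s_tt
    intercept = v_mean - slope * t_mean
    residual_ss = sum((v - (intercept + slope * t)) ** 2 for t, v in zip(times, values))
    # Residual variance uses n - 2 degrees of freedom (slope and intercept fitted)
    slope_se = math.sqrt(residual_ss / (n - 2) / s_tt)
    return slope, slope_se


if __name__ == "__main__":
    rng = random.Random(0)
    years = [1979 + m / 12 for m in range(21 * 12)]
    # Synthetic anomalies: 0.02 degC/yr trend plus white noise (not real data)
    anomalies = [0.02 * (y - 1979) + rng.gauss(0.0, 0.3) for y in years]
    slope, slope_se = linear_trend(years, anomalies)
    print(f"trend = {10 * slope:.3f} +/- {10 * 2 * slope_se:.3f} degC/decade (2 s.e.)")
```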
So what do Douglass et al come up with?
Superficially it seems clear that there is a separation between the models and the observations, but let’s look more closely….
First, note that the observations aren’t shown with any uncertainty at all, not even the uncertainty in defining a linear trend (roughly 0.1°C/dec). Secondly, the offsets between UAH, RSS and UMD should define the minimum systematic uncertainty in the satellite observations, which therefore would overlap with the model ‘uncertainty’. The sharp-eyed among you will notice that the satellite estimates, which are basically weighted means of the vertical temperature profiles, are also apparently inconsistent with the selected radiosonde estimates (you can’t get a weighted mean trend larger than any of the individual level trends!). [Correction: the UAH trends are consistent (see comments).]
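The weighted-mean point is easy to check for yourself. The sketch below assumes non-negative weights and uses made-up level trends and layer weights (they are not the real MSU weighting functions or any of the radiosonde values), simply to show that such a mean is always bracketed by the smallest and largest individual values.

```python
def weighted_mean(values, weights):
    """Weighted mean with non-negative weights, normalised to sum to one."""
    if any(w < 0 for w in weights):
        raise ValueError("this sketch assumes non-negative weights")
    total = sum(weights)
    return sum(v * w for v, w in zip(values, weights)) / total


if __name__ == "__main__":
    # Hypothetical level trends (degC/decade) and layer weights, for illustration only
    level_trends = [0.12, 0.10, 0.08, 0.05, 0.02]
    layer_weights = [0.10, 0.30, 0.35, 0.20, 0.05]
    mean_trend = weighted_mean(level_trends, layer_weights)
    # The mean must lie between the smallest and largest level trend
    print(min(level_trends) <= mean_trend <= max(level_trends))
```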
It turns out that the radiosonde data used in this paper (version 1.2 of the RAOBCORE data) does not have the full set of adjustments. Subsequent to that dataset being put together (Haimberger, 2007), two newer versions have been developed (v1.3 and v1.4) which do a better, but still not perfect, job, and additionally have much larger amplification with height. For instance, look at version 1.4:
The authors of Douglass et al were given this last version along with the one they used, yet they chose to show only the first (the one with the smallest tropical trend), without any additional comment, even though they knew that their results would be less clear-cut with the later version.
But more egregious by far is the calculation of the model uncertainty itself. Their description of that calculation is as follows:
For the models, we calculate the mean, standard deviation (sigma), and estimate of the uncertainty of the mean (sigma_SE) of the predictions of the trends at various altitude levels. We assume that sigma_SE and standard deviation are related by sigma_SE = sigma/sqrt(N – 1), where N = 22 is the number of independent models. ….. Thus, in a repeat of the 22-model computational runs one would expect that a new mean that would lie between these limits with 95% probability.
The interpretation of this is a little unclear (what exactly does the sigma refer to?), but the most likely interpretation, and the one borne out by looking at their Table IIa, is that sigma is calculated as the standard deviation of the model trends. In that case, the formula given defines the uncertainty on the estimate of the mean, i.e. how well we know what the average trend really is. But it only takes a moment to realise why that is irrelevant. Imagine there were thousands of simulations drawn from the same distribution: our estimate of the mean trend would get sharper and sharper as N increased. However, the chances that any one realisation would fall within those error bars would become smaller and smaller. Instead, the key standard deviation is simply sigma itself. That defines the likelihood that one realisation (i.e. the real world) is conceivably drawn from the distribution defined by the models.
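The thought experiment above can be sketched in a few lines. The code below draws N trends from a normal distribution with made-up parameters (a mean of 0.2 and a sigma of 0.1, chosen only for illustration), then counts how many of the individual realisations fall within two standard errors of the mean, using the sigma/sqrt(N – 1) formula quoted above, and how many fall within two sigma. The function name coverage and its defaults are hypothetical.

```python
import math
import random
import statistics


def coverage(n_runs, mean=0.2, sigma=0.1, seed=0):
    """Fraction of simulated trends inside mean +/- 2*SE and mean +/- 2*sigma."""
    rng = random.Random(seed)
    trends = [rng.gauss(mean, sigma) for _ in range(n_runs)]
    sample_mean = statistics.mean(trends)
    # Population standard deviation, paired with the sqrt(N - 1) divisor
    sample_sd = statistics.pstdev(trends)
    standard_error = sample_sd / math.sqrt(n_runs - 1)

    def fraction_within(half_width):
        inside = sum(abs(t - sample_mean) <= half_width for t in trends)
        return inside / n_runs

    return fraction_within(2 * standard_error), fraction_within(2 * sample_sd)


if __name__ == "__main__":
    for n_runs in (22, 200, 2000):
        within_se, within_sd = coverage(n_runs)
        print(f"N={n_runs:5d}  within 2*SE: {within_se:.2f}  within 2*sigma: {within_sd:.2f}")
```

As N grows, the share of realisations inside the standard-error bars should shrink towards zero, while the share inside the two-sigma range stays near 95%. That is why the standard error of the mean is the wrong yardstick for judging whether a single realisation is consistent with the ensemble.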
To make this even clearer, a 49-run subset (from 18 models) of the 67 model runs in Douglass et al was used by Santer et al (2005). This subset only used the runs that included volcanic forcing and stratospheric ozone depletion – the most appropriate selection for this kind of comparison. The trends in T2LT can be used as an example. I calculated the 1979-1999 trends (as done by Douglass et al) for each of the individual simulations. The values range from -0.07 to 0.426 °C/dec, with a mean trend of 0.185 °C/dec and a standard deviation of 0.113 °C/dec. That spread comes predominantly not from uncertain physics, but from the noise in each individual realisation.
From their formula the Douglass et al 2 sigma uncertainty would be 2*0.113/sqrt(17) = 0.06 °C/dec. Yet the 10 to 90 percentile for the trends among the models is 0.036–0.35 °C/dec – a much larger range (+/- 0.19 °C/dec) – and one, needless to say, that encompasses all the observational estimates. This figure illustrates the point clearly:
What happens to Douglass’ figure if you incorporate the updated radiosonde estimates and a reasonable range of uncertainty for the models? This should be done properly (and could be), but assuming that neither the slight difference in period for the RAOBCORE v1.4 data nor the selection of model runs because of volcanic forcings is important, using the standard deviations in their Table IIa you’d end up with something like this:
Not quite so impressive.
To be sure, this isn’t a demonstration that the tropical trends in the model simulations or the data are perfectly matched – there remain multiple issues with moist convection parameterisations, the Madden-Julian oscillation, ENSO, the ‘double ITCZ’ problem, biases, drifts etc. Nor does it show that RAOBCORE v1.4 is necessarily better than v1.2. But it is a demonstration that there is no clear model-data discrepancy in tropical tropospheric trends once you take the systematic uncertainties in data and models seriously. Funnily enough, this is exactly the conclusion reached by a much better paper by P. Thorne and colleagues. Douglass et al’s claim to the contrary is simply unsupportable.