The difference between a single calculation and a solid paper in the technical literature is vast. A good paper examines a question from multiple angles and finds ways to assess the robustness of its conclusions to all sorts of possible sources of error: in input data, in assumptions, and even occasionally in programming. If a conclusion is robust over as much of this as can be tested (and good peer reviewers generally insist that this be shown), then the paper is likely to stand the test of time. Although science proceeds by making use of the work that others have done before, it is not based on the assumption that everything that went before is correct. It is precisely because there is always the possibility of error that so much is based on ‘balance of evidence’ arguments that are mutually reinforcing.
So it is with the Steig et al. paper published last week. Their conclusions, that West Antarctica is warming quite strongly and that even Antarctica as a whole has been warming since 1957 (the start of systematic measurements), were based on extending the long-term manned weather station data (42 stations) using two different methodologies (RegEM and PCA) to interpolate to undersampled regions, using correlations from two independent data sources (satellite AVHRR and the Automated Weather Stations (AWS)), with validations based on subsets of the stations (15 vs 42 of them), etc. The answers in each of these cases are pretty much the same; thus the issues that undoubtedly exist (and that were raised in the paper), such as satellite data only being valid on clear days, the spottiness of the AWS data, and the fundamental limits of the long-term manned weather station data itself, aren’t that important to the basic conclusion.
One quick point about the reconstruction methodology. These methods are designed to fill in missing data points using as much information as possible concerning how the existing data at that point connects to the data that exists elsewhere. To give a simple example, if one station gave readings that were always the average of two other stations when it was working, then a good estimate of the value at that station when it wasn’t working would simply be the average of the two other stations. Thus it is always the missing data points that are reconstructed; the process doesn’t affect the original input data.
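The simple example above can be sketched in a few lines of Python. To be clear, this is not RegEM or PCA, just an ordinary least-squares toy with two predictor stations, and the station names and temperature values are invented for illustration. The weights are learned only from the times when the target station was reporting, and only the gaps are filled.

```python
from typing import List, Optional, Sequence, Tuple


def fit_two_predictors(
    a: Sequence[float], b: Sequence[float], target: Sequence[Optional[float]]
) -> Tuple[float, float]:
    """Least-squares weights (w_a, w_b) such that target ~ w_a*a + w_b*b.

    Only time steps where the target station reported (not None) are used.
    """
    saa = sab = sbb = sat = sbt = 0.0
    for x, y, t in zip(a, b, target):
        if t is None:
            continue  # the gap itself carries no information for the fit
        saa += x * x
        sab += x * y
        sbb += y * y
        sat += x * t
        sbt += y * t
    det = saa * sbb - sab * sab
    if abs(det) < 1e-12:
        raise ValueError("predictor stations are collinear over the overlap")
    # Solve the 2x2 normal equations by Cramer's rule.
    w_a = (sat * sbb - sab * sbt) / det
    w_b = (saa * sbt - sab * sat) / det
    return w_a, w_b


def infill(
    a: Sequence[float], b: Sequence[float], target: Sequence[Optional[float]]
) -> List[float]:
    """Return the target series with gaps estimated; observed values are kept."""
    w_a, w_b = fit_two_predictors(a, b, target)
    return [
        t if t is not None else w_a * x + w_b * y
        for x, y, t in zip(a, b, target)
    ]


# Hypothetical readings (deg C): station_c is always the mean of a and b,
# and is missing at the fourth time step.
station_a = [-20.0, -18.0, -25.0, -22.0, -19.0]
station_b = [-10.0, -14.0, -11.0, -16.0, -13.0]
station_c = [-15.0, -16.0, -18.0, None, -16.0]

filled = infill(station_a, station_b, station_c)
weights = fit_two_predictors(station_a, station_b, station_c)
```

Here the fitted weights come out at one half each, so the gap is filled with the average of the other two stations, and every value that was actually observed passes through untouched.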
This paper clearly increased the scrutiny of the various Antarctic data sources, and indeed this week errors were found in the records from the AWS sites ‘Harry’ (West Antarctica) and ‘Racer Rock’ (Antarctic Peninsula) stored at the SCAR READER database. (There was a coincidental typo in the listing of Harry’s location in Table S2 in the supplemental information to the paper, but a trivial examination of the online resources, or of the paper itself, in which Harry is shown in the correct location (Fig. S4b), would have indicated that this was indeed only a typo.) Those errors have now been fixed by the database managers at the British Antarctic Survey.
Naturally, people are interested in what effect these corrections will have on the analysis of the Steig et al. paper. But before we get to that, we can think about some ‘Bayesian priors’. Specifically, given that the results using the satellite data (the main reconstruction and source of the Nature cover image) were very similar to those using the AWS data, it is highly unlikely that a single station revision will have much of an effect on the conclusions (and clearly none at all on the main reconstruction, which didn’t use AWS data). Additionally, the quality of the AWS data, particularly any trends, has been frequently questioned. The main issue is that since they are automatic and not manned, individual stations can be buried in snow, drift with the ice, fall over, etc., and not be immediately fixed. Thus one of the tests Steig et al. did was a variation of the AWS reconstruction that detrended the AWS data before using them; any trend in the reconstruction would then come solely from the higher quality manned weather stations. The nature of the error in the Harry data record gave an erroneous positive trend, but this wouldn’t have affected the trend in the reconstruction based on the detrended AWS data.
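For readers unfamiliar with detrending, here is a minimal sketch of the idea, assuming a plain least-squares straight-line fit (the paper's exact procedure may differ). The series below is invented: some year-to-year wiggles plus a spurious warming trend of the kind a faulty record might contain. Removing the fitted line leaves the wiggles, which carry the correlation information, but no trend.

```python
from typing import List, Sequence, Tuple


def linear_trend(
    times: Sequence[float], values: Sequence[float]
) -> Tuple[float, float]:
    """Ordinary least-squares slope and intercept of values against times."""
    n = len(times)
    t_mean = sum(times) / n
    v_mean = sum(values) / n
    cov = sum((t - t_mean) * (v - v_mean) for t, v in zip(times, values))
    var = sum((t - t_mean) ** 2 for t in times)
    slope = cov / var
    return slope, v_mean - slope * t_mean


def detrend(times: Sequence[float], values: Sequence[float]) -> List[float]:
    """Subtract the fitted straight line, leaving only the residual variability."""
    slope, intercept = linear_trend(times, values)
    return [v - (slope * t + intercept) for t, v in zip(times, values)]


# Hypothetical station record: variability plus an erroneous trend.
years = [0.0, 1.0, 2.0, 3.0, 4.0]
wiggles = [0.3, -0.2, 0.1, -0.4, 0.2]
spurious_trend_per_year = 0.5
record = [w + spurious_trend_per_year * t for t, w in zip(years, wiggles)]

residual = detrend(years, record)
raw_slope, _ = linear_trend(years, record)
residual_slope, _ = linear_trend(years, residual)
```

The residual series has essentially zero slope whatever trend was in the input, which is why an erroneous trend at one AWS site cannot leak into a reconstruction built from detrended AWS data.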
Given all of the above, the Bayesian prior would therefore lean towards the expectation that the data corrections will not have much effect.
The trends in the AWS reconstruction in the paper are shown above. This is for the full period 1957-2006 and the dots are scaled a little smaller than they were in the paper for clarity. The biggest dot (on the Peninsula) represents about 0.5°C/dec. The difference that you get if you use detrended data is shown next.
As we anticipated, detrending the Harry data affects the reconstruction at Harry itself (the big blue dot in West Antarctica), reducing the trend there to about 0.2°C/dec, but there is no other significant effect (a couple of stations on the Antarctic Peninsula show small differences). (Note the scale change from the preceding figure: the blue dot represents a change of 0.2°C/dec.)
Now that we know that the trend (and much of the data) at Harry was in fact erroneous, it’s useful to see what happens when you don’t use Harry at all. The differences with the original results (at each of the other points) are almost undetectable. (Same scale as immediately above; if the scale in the first figure were used, you couldn’t see the dots at all!)
In summary, speculation that the erroneous trend at Harry was the basis of the Antarctic temperature trends reported by Steig et al. is completely specious, and could have been dismissed by even a cursory reading of the paper.
However, we are not yet done. There was erroneous input data used in the AWS reconstruction part of the study, and so it’s important to know what impact the corrections will have. Eric managed to do some of the preliminary tests on his way to the airport for his Antarctic sojourn and the trend results are as follows:
There is a big difference at Harry, of course (a reduction of the trend by about half), and an increase of the trend at Racer Rock (the error there had given an erroneous cooling), but the other points are pretty much unaffected. The differences in the mean trends for Antarctica, or for WAIS, are very small (around 0.01°C/decade), and the resulting new reconstruction is actually in slightly better agreement with the satellite-based reconstruction than before (which is pleasing of course).
Bayes wins again! Or should that be Laplace? ;)
Update (6/Feb/09): The corrected AWS-based reconstruction is now available. Note that the main satellite-based reconstruction is unaffected by any issues with the AWS stations since it did not use them.