# On replication

All of the data used are subject to ongoing updates. For the temperature records, the processing algorithms vary over time (for satellites as well as surface stations); for emissions and economic data, there are updates in reporting or estimation; and in all cases the correction of errors is an ongoing process. Since my interest was in how robust the analyses were, I spent some time reprocessing the updated datasets. This involved downloading the EDGAR3 data, the latest UAH MSU numbers, the latest CRUTEM2/HadCRU2v numbers, and alternative versions of the same (such as the RSS MSU data, HadCRUT3v, GISTEMP). In many cases, these updates are in different formats, have different ‘masks’ and require specific and unique processing steps. Given the complexity of (and my unfamiliarity with) economic data, I did not attempt to update that, or even ascertain whether updates had occurred.

In these two papers then, we have two of the main problems often alluded to. It is next-to-impossible to recreate exactly the calculation used in dLM06 since the data sets have changed in the meantime. However, since my scientific interest is in what their analysis says about the real world, any conclusion that was not robust to that level of minor adjustment would not have been interesting. By redoing their calculations with the current data, or with different analyses of analogous data, it is very easy to see that there is no such dependency, and thus reproducing their exact calculation becomes moot. In the MM07 case, it is very difficult for someone coming from the climate side to test the robustness of their analysis to updates in economic data, and so that wasn’t done. Thus, while we have the potential for an exact replication, we are no wiser about its robustness to possibly important factors. I was, however, able to test easily the robustness of their calculations to changes in the satellite data source (RSS vs. UAH) or to updates in the surface temperature products.

**Processing**

MM07 used an apparently widespread statistics program called STATA and archived a script for all of their calculations. While this might have been useful for someone familiar with this proprietary software, it is next to useless for someone who doesn’t have access to it. STATA scripts are extremely high level, implying they are easy to code and use, but since the underlying code in the routines is not visible or public, they provide no means by which to translate the exact steps taken into a different programming language or environment. However, the calculations mainly consisted of multiple linear regressions, which are a standard technique, and so other packages are readily available. I’m an old-school Fortran programmer (I know, I know), and so I downloaded a Fortran package that appeared to have the same functionality and adapted it to my needs. Someone using Matlab or R could have done something very similar. It was then a simple matter to check that the coefficients from my calculation and those in MM07 were practically the same and that there was a one-to-one match in the nominal significance (which was also calculated differently). This also provides a validation of the STATA routines (which I’m sure everyone was concerned about).
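For readers who want to see what such a cross-check involves, here is a minimal sketch in Python (not the Fortran package I used, and not the STATA script). It fits a multiple linear regression by solving the normal equations and compares the coefficients with a reference set. The data, the `reference` values and all function names are made up for illustration; in a real check the reference would be the coefficients reported in the paper.

```python
import math

def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x

def multiple_regression(predictors, y):
    """Ordinary least squares: returns [intercept, b1, b2, ...] via the normal equations."""
    rows = [[1.0] + list(p) for p in predictors]  # prepend the constant term
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    return solve(xtx, xty)

def coefficients_match(a, b, rel_tol=1e-6):
    """True if two coefficient lists agree to within the given tolerance."""
    return len(a) == len(b) and all(
        math.isclose(x, y, rel_tol=rel_tol, abs_tol=1e-9) for x, y in zip(a, b)
    )

# Synthetic example (hypothetical data): y = 1 + 2*x1 - 0.5*x2 exactly.
predictors = [(0.0, 1.0), (1.0, 0.0), (2.0, 3.0), (3.0, 1.0), (4.0, 4.0)]
y = [1.0 + 2.0 * x1 - 0.5 * x2 for x1, x2 in predictors]

coeffs = multiple_regression(predictors, y)
reference = [1.0, 2.0, -0.5]  # stand-in for coefficients reported elsewhere
print(coefficients_match(coeffs, reference))
```

The normal equations are fine for a small, well-conditioned problem like this one; numerical packages generally prefer a QR or similar decomposition, which is one reason results from two packages may agree only to within a tolerance rather than bit for bit.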

The processing in dLM06 was described plainly in their paper. The idea is to define area masks as a function of the emissions data and calculate the average trend. Two methods were presented: averaging over the area and then calculating the trend, or calculating the trends and averaging them over the area. With complete data these methods are equivalent, but not quite when there are missing data, though the uncertainties in the trend are more straightforward in the first case. It was pretty easy to code this up myself so I did. It turns out that the method used in dLM06 was not the one they said, but again, having coded both, it is easy to test whether that was important (it isn’t).
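The difference between the two averaging methods is easy to show with a toy example. The following is a minimal Python sketch under my own assumptions, not the code used for the paper or for my checks: two hypothetical grid cells with equal area weights, missing values marked as `None`, and ordinary least-squares trends. All names and numbers are invented for illustration.

```python
def ols_slope(times, values):
    """Least-squares trend of values against times, skipping missing (None) entries."""
    pairs = [(t, v) for t, v in zip(times, values) if v is not None]
    n = len(pairs)
    t_mean = sum(t for t, _ in pairs) / n
    v_mean = sum(v for _, v in pairs) / n
    num = sum((t - t_mean) * (v - v_mean) for t, v in pairs)
    den = sum((t - t_mean) ** 2 for t, _ in pairs)
    return num / den

def trend_of_area_mean(times, cells, weights):
    """Method 1: build the area-weighted mean series, then fit a single trend."""
    series = []
    for i in range(len(times)):
        present = [(w, c[i]) for w, c in zip(weights, cells) if c[i] is not None]
        if present:
            wsum = sum(w for w, _ in present)
            series.append(sum(w * v for w, v in present) / wsum)
        else:
            series.append(None)  # no cell reports at this time step
    return ols_slope(times, series)

def area_mean_of_trends(times, cells, weights):
    """Method 2: fit a trend in every cell, then take the area-weighted mean."""
    slopes = [ols_slope(times, c) for c in cells]
    return sum(w * s for w, s in zip(weights, slopes)) / sum(weights)

times = [0, 1, 2, 3, 4]
weights = [1.0, 1.0]  # hypothetical equal-area cells
cell_a = [0.0, 0.1, 0.2, 0.3, 0.4]
cell_b = [1.0, 1.3, 1.6, 1.9, 2.2]
cell_b_gap = [None, 1.3, 1.6, 1.9, 2.2]  # same cell with the first value missing

m1_complete = trend_of_area_mean(times, [cell_a, cell_b], weights)
m2_complete = area_mean_of_trends(times, [cell_a, cell_b], weights)
m1_gap = trend_of_area_mean(times, [cell_a, cell_b_gap], weights)
m2_gap = area_mean_of_trends(times, [cell_a, cell_b_gap], weights)

print(m1_complete, m2_complete)  # agree, up to rounding, with complete data
print(m1_gap, m2_gap)            # differ once a value is missing
```

With complete data the two numbers agree because the least-squares slope is linear in the data. In the gap case the area mean at the first time step comes from the cooler cell alone, which steepens the Method 1 trend while leaving Method 2 unchanged.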

**Replication**

Given the data from various sources and my own codes for the processing steps, I ran a few test cases to show that I was getting basically the same results in the same circumstances as were reported in the original papers. That worked out fine. Had there been any further issues at this point, I would have sent out a couple of emails, but this was not necessary. Jos de Laat had helpfully replied to two previous questions (concerning what was included in the emissions and the method used for the average trend), and I’m sure he or the other authors involved would have been happy to clarify anything else that might have come up.

Are we done? Not in the least.

**Science**
