Are the CRU data “suspect”? An objective assessment.

Kevin Wood, Joint Institute for the Study of the Atmosphere and Ocean, University of Washington

Eric Steig, Department of Earth and Space Sciences, University of Washington

In the wake of the CRU e-mail hack, the suggestion that scientists have been hiding the raw meteorological data that underpin global temperature records has appeared in the media. For example, New York Times science writer John Tierney wrote, “It is not unreasonable to give outsiders a look at the historical readings and the adjustments made by experts… Trying to prevent skeptics from seeing the raw data was always a questionable strategy, scientifically.”

The implication is that something secretive and possibly nefarious has been afoot in the way data have been handled, and that the validity of key data products (especially those produced by CRU) is suspect on these grounds. This is simply not the case.

It may come as a surprise to some that the first compilation of world-wide meteorological data was published by the Smithsonian Institution in 1927, long before anthropogenic climate change emerged as an important issue (Clayton et al., 1927). This volume is still widely available on the library shelf, as are updates that were issued periodically. This same data collection provided the foundation for the World Monthly Surface Station Climatology, 1738-cont. As has been the case for many years, any interested party can access this from UCAR and other electronic data archives.

Now, it is well known that these data are not perfect. Most records are not as complete as could be wished. Errors periodically creep in and have to be identified and weeded out. But beyond the simple errors of the key-entry type there are inevitably discontinuities or inhomogeneities introduced into the records due to changes in observing practices, station environment, or other non-meteorological factors. It is very unlikely there is any historical record in existence unaffected by this issue.

Filtering inhomogeneities out of meteorological data is a complicated procedure. Coherent surface air temperature (SAT) datasets like those produced by CRU also require a procedure for combining different (but relatively nearby) record fragments. However, the methods used to undertake these unavoidable tasks are not secret: they have been described in an extensive literature over many decades (e.g. Conrad, 1944; Jones and Moberg, 2003; Peterson et al., 1998, and references therein). Discontinuities may nevertheless persist in data products, but when they are found they are published (e.g. Thompson et al., 2008).

Furthermore, it is a fairly simple exercise to extract the grid-box temperatures from a CRU dataset (CRUTEM3v, for example) and compare them with raw data from the World Monthly Surface Station Climatology. The CRU data are publicly available as well. One should not expect a perfect match due to the issues described above, but an exercise like this does provide a simple way to evaluate the extent to which the CRU data represent the underlying raw data. In particular, it would presumably be of interest to know whether the trends in the CRU data are very different from the trends in the raw data, since this could be taken as an indication that the methods used by CRU result in an overstatement of the evidence for global warming.
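To make the comparison concrete, here is a minimal sketch of how one might compute and compare linear trends for two annual series. It is not the procedure used for the figures in this post: the function name `linear_trend`, the choice of an ordinary least-squares fit, and the numbers below are all assumptions made for illustration. The data are synthetic, not CRU or station values.

```python
def linear_trend(years, values):
    """Ordinary least-squares slope, in units of `values` per decade.

    Missing values (None) are skipped. This is a generic OLS fit,
    assumed here for illustration only.
    """
    pairs = [(y, v) for y, v in zip(years, values) if v is not None]
    n = len(pairs)
    if n < 2:
        raise ValueError("need at least two non-missing values")
    mean_y = sum(y for y, _ in pairs) / n
    mean_v = sum(v for _, v in pairs) / n
    sxx = sum((y - mean_y) ** 2 for y, _ in pairs)
    sxy = sum((y - mean_y) * (v - mean_v) for y, v in pairs)
    return 10.0 * sxy / sxx


# Synthetic illustration only: made-up numbers, not CRU or station data.
years = list(range(1900, 2000))
# "raw" series: a steady trend plus occasional spurious spikes
raw = [0.007 * (y - 1900) + (0.3 if y % 7 == 0 else 0.0) for y in years]
# "adjusted" series: the same trend with the spikes removed
adjusted = [0.007 * (y - 1900) for y in years]

raw_trend = linear_trend(years, raw)
adjusted_trend = linear_trend(years, adjusted)
print(f"raw: {raw_trend:.3f} per decade, adjusted: {adjusted_trend:.3f} per decade")
```

If quality control had introduced a spurious warming signal, the two slopes would differ by more than their uncertainties. A fuller analysis would also estimate those uncertainties, which this sketch does not do.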

As an example, we extracted a sample of raw land-surface station data and corresponding CRU data. These were arbitrarily selected based on the following criteria: the length of record should be ~100 years or longer, and the standard reference period 1961–1990 (used to calculate SAT anomalies) must contain no more than 4 missing values. We also selected stations spread as widely as possible over the globe. We randomly chose 94 out of a possible 318 long records. Of these, 65 were sufficiently complete during the reference period to include in the analysis. These were split into two groups of 33 and 32 stations (Set A and Set B), which were then analyzed separately.
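The completeness criterion and the anomaly calculation described above can be sketched as follows. This is a hedged illustration, not the code behind the analysis. It assumes that "missing values" means missing monthly values, that anomalies are taken relative to each calendar month's 1961–1990 mean, and that a station record is stored as a dictionary keyed by `(year, month)`. The names `is_complete` and `anomalies` are hypothetical, and the station data are synthetic.

```python
REF_START, REF_END = 1961, 1990  # standard reference period from the text


def is_complete(monthly, max_missing=4):
    """True if the reference period has at most `max_missing` missing months.

    Assumption: "missing values" is counted in monthly values.
    """
    missing = sum(
        1
        for year in range(REF_START, REF_END + 1)
        for month in range(1, 13)
        if monthly.get((year, month)) is None
    )
    return missing <= max_missing


def anomalies(monthly):
    """Subtract each calendar month's reference-period mean from the record."""
    climatology = {}
    for month in range(1, 13):
        vals = [
            monthly.get((year, month))
            for year in range(REF_START, REF_END + 1)
        ]
        vals = [v for v in vals if v is not None]
        if not vals:
            raise ValueError(f"no reference data for month {month}")
        climatology[month] = sum(vals) / len(vals)
    return {
        (y, m): (None if v is None else v - climatology[m])
        for (y, m), v in monthly.items()
    }


# Synthetic station: a simple seasonal offset plus a slow warming trend.
station = {
    (y, m): 10.0 + m + 0.01 * (y - 1900)
    for y in range(1900, 2000)
    for m in range(1, 13)
}
station[(1975, 3)] = None  # one missing month inside the reference period

usable = is_complete(station)
anoms = anomalies(station) if usable else {}
```

Working with anomalies rather than absolute temperatures removes each station's seasonal cycle and mean level, so that records from very different climates can be compared on a common footing.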

Results are shown in the following figures. The key points: both Set A and Set B indicate warming with trends that are statistically identical between the CRU data and the raw data (>99% confidence); the histograms show that CRU quality control has, as expected, narrowed the variance (both extreme positive and negative values removed).

