Are the CRU data “suspect”? An objective assessment.

Comparison of CRUTEM3v data with raw station data taken from World Monthly Surface Station Climatology. On the left are the mean temperature anomalies from each pair of randomly chosen time series. On the right is the distribution of trends in those time series, with their means and standard errors. (The standard error provides an estimate of how well the sampling of ~30 stations represents the full global data set assuming a Gaussian distribution.) Note that not all the trends are for identical time periods, since not all data sets are the same length.
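For readers who want to try the sampling idea themselves, here is a minimal sketch in Python. It is not the code used for the figure. The station records are synthetic (the `make_synthetic_station` helper, the 0.7 degrees/century trend and the noise level are all made-up assumptions for illustration), so only the procedure carries over: fit a linear trend to each randomly chosen series, then report the mean of the trends and its standard error.

```python
import random
import statistics


def linear_trend(years, values):
    """Ordinary least-squares slope of values against years (units per year)."""
    mean_x = statistics.fmean(years)
    mean_y = statistics.fmean(values)
    sxx = sum((x - mean_x) ** 2 for x in years)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(years, values))
    return sxy / sxx


def make_synthetic_station(rng, true_trend_per_century=0.7, noise_sd=0.5):
    """Hypothetical station: annual anomalies with a linear trend plus Gaussian noise.

    Records start in different years, mimicking data sets of unequal length.
    The trend and noise values are invented for this example.
    """
    start = rng.randint(1880, 1940)
    years = list(range(start, 2006))
    values = [
        true_trend_per_century / 100.0 * (year - start) + rng.gauss(0.0, noise_sd)
        for year in years
    ]
    return years, values


def sample_trend_summary(stations, n, rng):
    """Pick n stations at random; return (mean trend, standard error) in degrees/century."""
    chosen = rng.sample(stations, n)
    trends = [100.0 * linear_trend(years, values) for years, values in chosen]
    mean = statistics.fmean(trends)
    std_err = statistics.stdev(trends) / n ** 0.5
    return mean, std_err


rng = random.Random(0)  # fixed seed so the run is repeatable
stations = [make_synthetic_station(rng) for _ in range(1000)]
for label in ("A", "B"):
    mean, std_err = sample_trend_summary(stations, 30, rng)
    print(f"Set {label}: {mean:.2f} +/- {std_err:.2f} degrees/century")
```

With real data you would replace `make_synthetic_station` with code that reads the station files. The point of drawing two independent random sets is that, if the sampling is fair, their mean trends should agree to within roughly the quoted standard errors.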

Conclusion: There is no indication whatsoever of any problem with the CRU data. An independent study (by a molecular biologist in Italy, as it happens) came to the same conclusion using a somewhat different analysis. None of this should come as any surprise, of course, since any serious errors would have been found and published already.

It’s worth noting that the global average trend obtained by CRU for 1850-2005, as reported by the IPCC, 0.47 0.54 degrees/century,* is actually a bit lower (though not by a statistically significant amount) than we obtained on average with our random sampling of stations.

*See table 3.2 in IPCC WG1 report.




242 comments on this post.
  1. thingsbreak:

    RE: “This is just bandwagoning and the flinging around of baseless accusations by a right-wing think tank”

    Gavin, I surely hope you’re not somehow suggesting that the “skeptical” blogs are conflating a CATO Institute Senior Fellow’s/anti-regulation think tank’s opinion with that of the entire nation of Russia– oh, no wait, that’s exactly what they did. Carry on.

  2. Ken W:

    ZT (143):
    “If urbanization is not effecting thermometer readings – why is the land temperature increasing faster than the ocean? Surely, if the global temperature was increasing – would not both land and sea increase at the same rate?”

    You can clearly see the land warming faster than the ocean in this GISS dataset plot:

    This is exactly what one should expect, because it takes so much more energy to raise the temperature of water than land or air. That’s why inland climates experience more extreme temperature ranges than coastal climates. Even though there is wind and constant exchanges of energy between the surface of the ocean and the atmosphere, things aren’t in an equilibrium state. There are many complex factors (e.g. evaporation which loses heat energy into the atmosphere, circulation patterns of both wind and water, melting glaciers pouring cold water into the oceans, etc.) which will also affect the measured ocean surface warming.

    There actually is an urban warming effect. Temperatures are warmer in urban areas than surrounding rural areas (another human factor in climate change), but as has been shown in previous links they don’t bias the computed global surface temperature. Those urban-affected temperature measurements (i.e. the ones photographed next to air conditioners or surrounded by blacktop) are either eliminated or corrected using nearby rural measurements.

  3. dhogaza:

    This is just bandwagoning and the flinging around of baseless accusations by a right-wing think tank

    From a country (Russia) banking a large part of its economic future on increasing sales of oil and natural gas to Europe.

  4. Ken W:

    kdk33 (198):
    “Stick to the satellite data. It’s been warming some. Big deal.
    But why? – that’s the question.”

    Primarily because of increased atmospheric CO2 from human burning of fossil fuels.

  5. DAS:

    @ #49 ZZT: “Question – how do we know that these stations are not simply measuring the growing proximity of asphalt and air conditioning units with time?”

    Wow, you’ve hit on something there alright(roll eyes)…more asphalt + more AC equipment = higher readings…DUH! The sea level actually only exists to play the part of a giant mercury thermometer: more heat and pollution from industry, human activity, and energy usage causes level to rise.

  6. Dave C:

    Is anybody else amused by the fact that kdk33 (#198) seems to think that by employing a bit of sophistry, hand-waving and no data at all he can refute the findings of an entire field of research?

  7. Kevin McKinney:

    Norman, you may be right in your comments, but nevertheless the amount of energy “coming in” to the climate system in the highest latitudes in winter is mostly due to physical transport–air masses & ocean currents. The absorption of IR you describe is–subject to the correction of someone who understands this in depth, which I don’t claim to–much de-emphasized in winter as compared with summer.

    I’m pretty darn sure that the seasonal differential in warming is empirically well supported. This quote is from a story on a 2003 paper, but I think there’s more recent data as well, could I but take a bit more time to search:

    “Most importantly, temperatures increased on average by 1.22 degrees Celsius per decade over sea ice during Arctic summer. The summer warming and lengthened melt season appears to be affecting the volume and extent of permanent sea ice. Annual trends, which were not quite as strong, ranged from a warming of 1.06 degrees Celsius over North America to a cooling of .09 degrees Celsius in Greenland.”

    Interesting and recent, but not, sadly, quite to the point we’ve been discussing:

    Claims to show human attribution of the polar warming.

    On the other hand, this paper says I’m completely wrong:

    Says that cloud forcings create more amplification in the winter than in summer. How you reconcile that with the 2003 paper, I’m not sure. (This layman is headed for bed, before he looks even more foolish.)

    Must read more about this. . .

  8. Terry:

    Gavin, I don’t wish to be a pest, but what is the official line on my question @195? From a purely scientific data QA perspective it surely makes sense. Cheers

    [Response: 100 would do it – if you knew they were perfect. Since you don’t, you need redundancy so that you can check for outliers and jumps etc. However you don’t need 1000s of stations to get a good estimate of the global trends – which is why it doesn’t much matter how you cut it, you end up with the same thing. But, the more the merrier when it comes down to pinning down regional patterns – they do require more data overall. – gavin]

  9. DVG:

    Ray Ladbury@169: Wow, religious conviction certainly runs deep.

  10. John P. Reisman (OSS Foundation):

    #193 Ron R.

    Sorry about the 502 messages. The site is pretty busy and when the ram overloads it shuts down because it is on a shared server. There is a cron job that reboots it after 5 minutes and I am actively looking for a new host for the site that gives me more ram headroom.

    I uploaded the 2007 image of the passage so now there is 2007 and 2008.

    If you are interested in helping build new image sections contact me:

    I can use all the help I can get.

  11. Cardin Drake:

    This post is a good starting point. But CRU can’t just bunker down and wait for all this to go away. Any fool can do this analysis, but I’m afraid that CRU will have to start from their own raw data and show step by step how they got to their adjusted data before they will have any credibility. And if you think about it, that is not too much to ask of them.
    Legitimate questions remain. For the NOAA data set, it doesn’t make sense that the adjustments wind up .5 degrees higher than the raw data. Intuitively, the UHI effect would make you expect that the adjustments would net out lower, or at least neutral.

  12. Terry:

    Re Gavin @208, Sure the more the merrier and redundancy is a good objective, but my question still remains as to why use contaminated sites that need correction, if there are enough uncontaminated ones already available. I also understand that the latest CO2 satellite data indicates that regional effects are now clearly very important, but again surely there must be sufficient clean sites to do the job.

  13. Edward Greisch:

    84 Dan e Bloom: By 2500 we will either be extinct or living on Mars, asteroids, the moons of Jupiter, the moons of Saturn etc. Polar cities are still a non-starter. Your time scale is off by a multiple of 10. Things are going to happen 10 times as fast as you imagine. Maybe faster.

  14. Alan of Oz:

    Great post, simple logic and useful links.

  15. CM:

    Steve Fish (#188), thanks for the explanation. I’m in the humanities, and don’t know much about actually working with numbers. I comfort myself that at least I’m aware of my limitations, unlike some (most?) of the R script kiddies suddenly out there doing private investigations of weather station records and FIX_ME notes. This kind of post, with guidance on how they could at least make a meaningful junior science-fair project out of it, is a very good idea.

  16. sHx:

    For the love of all things green and cool, could someone who is in-the-know tell me what is the percentage of the data the CRU used for its climate modeling that is NOT covered by the confidentiality agreements? This question will be moot in six or seven months’ time since the UK Met service announced that they’ll be seeking permissions from other Met services in order to release all of its raw data, but it is extremely annoying to see a few evidently non-climate scientists seeking mileage out of the debate by bandying the figures of 95 and 98 percent.

    More than two weeks ago, in another thread, I took issue with a commenter named Marco, in the context of the legitimacy of the FOI requests made by Steven McIntyre, when Marco claimed that “the confidentiality agreements covered only 2% of the data, which carries very limited extra information”. Marco later claimed that the figure of 2% was “notably” in McIntyre’s own request for the FOI. He did not provide any citation to substantiate his claim.

    And today we have dhogaza saying at #185:

    The information I’ve read is that the GHCN database includes all the available data that’s not restricted by distribution agreements. This is what GISTEMP uses, and represents about 95% of the data used by CRU, the other 5% being the data subject to such agreement and which McIntyre et al have been screaming about being “hidden” for all of these years.

    and at #186:

    Oh, Jonesy, and as far as that other 5% that’s restricted and used when creating HadCRUT (but not NASA GISTEMP), the data can be had from the individual countries, though unfortunately often for a fee.

    See, McIntyre et al have been asking for restricted access data for free that in some cases CRU had paid for, rather than going to the source and paying for it themselves. Understandable, who wants to pay for data? And not all of the restricted data has been paid for, but still – you do get the point, I hope?

    I skip the blindingly obvious questions of “where did you read the figure of 95%?” and “do you want me to believe that Steve McIntyre would not spare a few lousy dollars to buy the missing data from the gazillions that he was supposed to be receiving from the fossil fuel industry?” I skip these questions because this time I want to hear from climate scientists, not from cheap propagandists.

    What is the percentage of data used by the CRU that is NOT covered by the confidentiality agreements? What is the weighting of the ‘missing bits’ on the CRU’s climate models? Has Phil Jones ever made an attempt to release the data not covered by confidentiality agreements AND provide the names of those Met services that sold the rest?

    I am not interested in the code for the climate model, or Jones’s right to scientific vainglory, or whether there are other sources proving the AGW. My questions are only about the CRU data because specific figures of 95% and 98% are being spread around, without any citation, as the percentage of the freely available CRU data, and attacks are made on others on this basis. The answers to these questions are important to me not because I intend to replicate the CRU’s climate science in my bedroom on my Pentium 4 PC but because I want to assess the credibility of various claimants like Phil Jones, Steve McIntyre, Marco and dhogaza.

  17. Barton Paul Levenson:

    Bruce H. Foerster: The largest source of greenhouse gases is from the raising of livestock for human consumption which, according to very reliable source ( Worldwatch Institute ) accounts for most ( 51% ) of all green house gas emissions!

    BPL: Maybe 51% of methane emissions, but the vast amount of artificial CO2 emitted is from burning fossil fuels.

  18. Completely Fed Up:

    “From a country (Russia) banking a large part of its economic future on increasing sales of oil and natural gas to Europe.”

    More like

    From a company mouthpiece rather like the American Petroleum Institute, the Cato Institute or Heartland Institute.

    There are fewer differences between Russians and the West than the US are generally comfortable with. Unfortunately, there are fewer differences between Russians and the West than many USians would hope for.

  19. Pekka Kostamo:

    #195 Terry:
    Perhaps a good starting point is to take a look at the international monitoring networks, as presented by the World Meteorological Organization (WMO). With some patience, you find details of the various station networks as well as the multiple parameters used in monitoring the climate.

    The climate observation stations are, of course, just a subset of the World Weather Watch Global Observations System, an overview of which is conveniently available at:

    Further pages of the WMO give details of data collection subsystems, the messaging and measurement standards etc., developed over the 150 years that a global effort to predict the weather has existed.

    It is a huge system, serving great many purposes both locally and internationally. Climate monitoring is just one aspect of it.

  20. Completely Fed Up:

    sHx provides wisdom:

    “A) Why will there not be a North Pole by 2500 AD? Is it because it will float on dwindling ice floes and eventually go extinct like polar bears?”

    Nope, because the north pole is an ocean area with ice on top.

    I would not think thinning floating ice is a good place to build your heated city.

    And by 2500 there won’t be any ice to build on there anyway.

  21. Completely Fed Up:

    “Does this Gaussian distribution have a constant average value over time?”

    In the same way as a FFT does.

  22. Anne van der Bom:

    16 December 2009 at 12:45 PM

    If urbanization is not effecting thermometer readings

    I would interpret Ken’s words “the myth is that asphalt has biased the temperature readings” as: “the myth is that asphalt has biased the temperature record”.

    Urbanization IS affecting thermometer readings, but it is accounted for in the final temperature records.

    Problem with many blog scientists weighing in on climate change is that they want to overcorrect so the positive trend goes away.

  23. Barton Paul Levenson:


    You caught me in careless phrasing. I meant, of course, “there will be no north polar ice cap in 2500.”

    As to the capabilities of human society in 2500–I, personally, expect human civilization to collapse almost completely within the next forty years. I doubt it will be back up in time to build self-sustaining Antarctic cities in 2500.

  24. Barton Paul Levenson:

    Joseph Sobry,

    The Indians don’t eat the cows because in sowing and harvest seasons, the cows pull the plows. There’s usually a good reason behind a seemingly irrational custom.

  25. Jiminmpls:

    #164 TH – Yes, in parts of the US, October was very cold indeed – and November was the 3rd warmest on record. Now December is starting off cold again. Western ski resorts had their earliest openings in 40 years in October, but were pretty much shut down for the big Thanksgiving holiday. Gosh, sounds like extremes in weather to me – which is exactly what those science guys have predicted.

    OTOH, globally, the pattern is quite different. October was the sixth warmest on record and November was the fourth warmest on record. Sept-Nov was the fourth warmest on record and Jan-Nov was the fifth warmest on record.

    Check out this site:

    It’s very easy to understand and you can start basing your opinions on factual information rather than bullsh1t.

  26. Snorbert Zangox:


    The article that I read appears to be in an English language version of a Russian newspaper. I do not know if that newspaper is associated with “a right-wing think tank” or not. However, that article says “The data of stations located in areas not listed in the Hadley Climate Research Unit Temperature UK (HadCRUT) survey often does not show any substantial warming in the late 20th century and the early 21st century.” That implies that more than 19th century data are affected. Scroll down to “Russia affected by Climategate”

    [Response: Well, journalism is in trouble in Russia too… The relevant figure from the IEA report is here, which shows the difference between their ‘all station’ index and the HadCRU index for Russia as a whole. Since there was no check for inhomogeneities or jumps in the IEA index, it doesn’t stand up to much scrutiny, but even if it were fine, the differences in the 20th C trend are small. The whole ‘someone made adjustments/screened stations therefore fraud!’ line of argument is getting very old. – gavin]

  27. Ray Ladbury:

    DVG @209, Isn’t it interesting that creationists, climate change denialists and even smokers all immediately jump on the religious line like that. Maybe it is because they would never think of actually looking at the evidence and drawing conclusions based on that.

    The Wall Street Urinal used to be a decent paper. Now I wouldn’t line a birdcage with it. If I want news from an economic perspective, I read The Economist–and I do read it weekly. They at least acknowledge physical reality–unlike the Urinal where spin is supreme.

  28. Lynn Vincentnathan:

    This is a great post.

    I teach criminal justice research methods, and some of the basics are the same for all sciences, so I sometimes use issues in climate science as examples. I just finished teaching hypothesis testing — which is very difficult…it sounds like some double negative, cuckoo way of doing things to novices…, as in “why would anyone want to establish a null hypothesis, when it’s the research hypothesis we’re interested in.”

    Anyway, I told them how climate denialists are always assaulting the scientists with “Have you considered that increasing solar activity may be warming the earth, or X, Y, or Z natural factors.” And the scientists reply, of course we have. That’s our null hypothesis. We only reject it when we are confident that the data can no longer be explained by those natural factors; there is a .05 or less probability that those natural factors can explain the data. ((Note that conscientious laypersons were not sitting around waiting for the null to get down to .05 probability, which happened in 1995, but were busy well beforehand reducing, reusing, recycling, going on alt energy, AND saving money in the process.))

    Anyway, in research methods there are all sorts of issues of validity (did you set your watch to the new time zone) and reliability (does your watch keep good time) — e.g. statistical conclusion validity, which requires enough data (years of temp findings) to reach statistical significance. Then there is internal validity, construct validity, external validity, criterion-related validity (proxies, etc), and so on.

    And it seems what the scientists were doing in those emails highlighted by the denialists was in a nutshell addressing and correcting problems relating to such validity and reliability threats. And, of course, laypersons with agendas can be easily convinced that sounds cuckoo.

    Shame on the real scientists amongst the denialists who work against scientific understanding.

    [Response: Thanks. Very thoughtful and interesting.–eric]

  29. Patrik:

    Dear scientists,

    What is the main difference between Set A and Set B?
    Because Set A shows exactly what many sceptics are saying:
    That the 1930:s was just as “hot” as present day.

    [Response: A and B are two random samplings of a small fraction of the data so you wouldn’t expect them to match the global average perfectly. Some parts of the globe may have been as warm as present day; indeed, that is exactly what the data suggest.–eric]

  30. John P. Reisman (OSS Foundation):

    I summarized ‘CRU Data’ & ClimateGate based on the wonderful RC work from the 4 posts.

  31. Kevin McKinney:

    From Serreze et al, 2009:

    “As the climate warms, the summer melt season lengthens and intensifies, leading to less sea ice at summer’s end. Summertime absorption of solar energy in open water areas increases the sensible heat content of the ocean. Ice formation in autumn and winter, important for insulating the warm ocean from the cooling atmosphere, is delayed. This promotes enhanced upward heat fluxes, seen as strong warming at the surface and in the lower troposphere. This vertical structure of temperature change is enhanced by strong low-level stability which inhibits vertical mixing. Arctic amplification is not prominent in summer itself, when energy is used to melt remaining sea ice and increase the sensible heat content of the upper ocean, limiting changes in surface and lower troposphere temperatures. Loss of snow cover contributes to an amplified temperature response over northern land areas, but this temperature change is not as pronounced as over the ocean.”

    Comiso 2006 has a good discussion on satellite observations of polar warming/amplification (though not that much on the seasonality issue):

    On the DIY front, this NOAA page lets you create your own timeseries based on various data. I created this series based on NCEP reanalysis for latitudes north of 75. It clearly shows the amplified winter warming in the form of rapidly warming minima; the highs increase a bit too, but much, much less.

    Hmm, all this corrected information tastes a lot like crow. . . ;-)

  32. Ron R.:

    John P. Reisman, I attempted to contact you re: the pics but your contact page came back telling me that my email address is invalid. It’s not. Can you check that everything’s working right? Or if you want to leave your email address here I can contact you.

  33. alantrer:

    Re: # 156
    The context of Eric’s post was surface station temperature records (CRUTEM3v). My comment was within that same context. I imply nothing with regard to radiosonde records.

    Re: # 166
    If it were a spelling error I feel certain my spell checker would have flagged it. Perhaps you meant semantics? I confess I don’t have one of those checkers. I’m clearly exposed to human error there.

    Re: # 167
    Reducing complexity of an argument reduces objections to it. In this case using the raw values would eliminate a large portion of the skeptics’ objections. As demonstrated, A’ as input achieves the same result (statistically) as A. If using A rather than A’ would reduce objections then it would make sense to use A.

    However this whole argument is moot. In # 194 Eric claims his post was not an attempt at a proof. It was simply a demonstration of how one could take a look at the data themselves. And in that respect there appears to be plenty of activity under way.

    [Response: A slight correction: Any thinking person can see they are wasting their time trying to get a different answer than I did. That’s the point of doing random sampling. The chances of getting a different answer than I did with the full data set is .01 percent. So, no, I didn’t prove anything, but I did demonstrate it with very very high confidence.–eric]

  34. Ron R.:

    I was wrong Steve Bloom #81. You’re right that NASA page has some good stuff on it, thanks.

    John, here’s a couple of direct links. I’d use the earliest and latest years available for comparison. BTW, they have larger sizes and animation available.

    On the Arctic: 1979 2008


    On the Greenland ice sheet melt: 1979 2007

  35. Silk:

    “What is the main difference between Set A and Set B?
    Because Set A shows exactly what many sceptics are saying:
    That the 1930:s was just as “hot” as present day. ”

    Not from my eyeballing.

    Some years in the 1930s were hot, but (eye-balling, because I don’t have the data) the last decade is clearly warmer on average than any previous decade.

    And eyeballing again, the trend is clearly upwards.

    Neither of which would please the sceptics.

  36. Ron R.:

    Five-Year Average Global Temperature Anomaly 1881-1885 2004-2008

  37. Snorbert Zangox:

    It does not look as if the inclusion of all stations changes the slope of the temperature/time curve for the last 30 years of the 20th century. I would like to see the UAH data for the same areas overlain on the curve that you linked. However, the inclusion of all stations for the full 145 years of temperature record would change the slope of the overall line significantly. (by about half if my eyeball estimate is close) This is because the full network data set shows higher temperatures for all years up until 1960.

    Also, keep in mind, if the contention of the Russians and others is correct, that CRU added temperature to the data during the first half of the 20th century, the slope of the overall line would fall even more.

    I think that the approach of demonstrating that the putative changes did not happen by displaying selected data sets and by demonstrating that the differences are small will fail. I think that the only way to put this to rest is to return to the source data, display all of the code and repeat, publicly, derivation of the CRU adjusted data temperature set.

    [Response: If, when you say “this will fail” you mean that many people will continue to believe what is staring them in the face, yes, you are probably right. The question is whether it is worth anyone going through many years of work to find out something that is already known. Very few, if any scientists, are going to bother, since they know they are talking to a crowd of people that are apparently incapable of grasping the most basic statistics. If you mean that *you* don’t believe the results shown here address the issue, then I’m not sure what to say. Again: We did not show ‘selected’ data sets. We show randomly-selected data sets. There is a huge difference.–eric]

  38. ZZT:

    Many thanks for all the answers. Can anyone do a comparison of the A and B sets before the temperatures have been adjusted? If both sets show the same trend and statistical properties before adjustment, this would be quite convincing. If the adjustments are essentially random, they should introduce no net effect.

  39. Many Volcanos:

    I’m not a climate scientist but I do have some expertise with data. If I’m understanding this correctly climate scientists are trying to infer/predict future climate change based on only a few hundred years of sketchy data? Isn’t the earth 4 billion plus years old and has had weather basically since it’s had oceans? I’m not sure how any conclusions can be made with such a limited sampling. To say the Global Warming debate is concluded based on these findings seems premature…

  40. Kevan Hashemi:

    I don’t recall anybody suggesting that CRU’s analysis of their own station data set is faulty. I’m a skeptic, and I applied a simpler version of their analysis to their data and obtained the same trend. So I’m not sure what criticism this article answers.

    Some people claim that CRU picked only sites that showed warming in Russia, rejecting 75% of the available stations. Other people claim that urban heating is a systematic error in the entire global set. My friends and I have been concerned about the dramatic changes in the number of weather stations available in the twentieth century, which you can see graphed against the CRU trend here. We even show that disappearing stations exhibit their own pronounced upward trend.

    We would very much like to hear your answers to these concerns about the CRU record. I’d like to see you show the plot of the number of weather stations versus anomaly and discuss with us why you believe the station number does not introduce a systematic error.

  41. Completely Fed Up:

    “based on only a few hundred years of sketchy data?”

    Why is this wrong?

    After all, engineering bases itself off the written record for engineering works rather than the 30,000 year record of human tool use (not to mention tool use by nonhumans!).

    Please justify calling it sketchy.

  42. eric:

    Comments are now closed.

    I get the last word.

    It really amazes me that some people asked “how do we trust your analysis of the CRU data that we don’t trust”? This is really grasping at straws, folks.

    But the point is: you don’t have to believe us. Go look at the data yourself.

    My favorite comment, #46:

    “What this shows, in my opinion, is that anyone who claims to have spent yeeeaaaars of his/her life studying the dark ways of the IPCC/NOAA/WMO/etc., and still cannot reproduce their results or still cannot understand/believe how the results were obtained, is full of sh#t…”

    Also, thanks to John Reisman for his excellent summary: