
How easy it is to get fooled

When you analyse your data, you usually assume that you know what the data really represent. Or do you? This question has, over time, marred studies on solar activity and climate, and more recently cosmic rays and clouds. And yet again, the issue pops up in two recent papers: one by Feulner (‘The Smithsonian solar constant data revisited’) and another by Legras et al. (‘A critical look at solar-climate relationships from long temperature series’). Both papers show how easy it is to be fooled by your data if you don’t know what they really represent.

First of all, I really think these papers are worth reading, because sometimes there are papers published that do not appreciate the importance of meta-data (information about the data) and do not question what they really represent.

Feulner demonstrates how the failure to adequately correct for seasonal variations and volcanic eruptions can lead to spurious results in terms of the brightness of the direct solar beam (pyrheliometry), the brightness of the sky in a ring around the Sun (pyranometry), and measured precipitable water content.

Such mistakes can easily give the impression that cosmic rays induce aerosol formation. Feulner’s work is a reanalysis of Weber (2010), which strongly suggested that cosmic rays cause a large part of atmospheric aerosol formation. We have already discussed cosmic rays and aerosols (here), and similar claims have been made before. The new aspect here is the analysis of pyranometry and pyrheliometry.

It is important to note that Feulner’s analysis focused on a relatively short period, 1923-1954, and that he only addressed parts of the analysis presented in the Weber (2010) paper (Weber examined the periods 1905 to 1954 and 1958 to 2008). I’m told that the Smithsonian (SAO; Smithsonian Astrophysical Observatory) data from 1905-23 are generally considered somewhat problematic due to instrument changes and calibration issues (I admit, I’m no expert on this issue).

Feulner also informs me that he has looked at the automatic measurements from Mauna Loa (1958-2008), but these were apparently problematic to analyse. They differ from the old SAO data because measurements taken during bad weather may be ‘contaminated’ and there may have been instrument failures. There is also a spurious drift over the whole period – possibly caused by anthropogenic water vapour and/or aerosols – and this period had frequent episodes of active volcanism.

One argument is that observations of low sunspot numbers (less than 50) tend to coincide with the winter (December-February) and spring (March-May), while large sunspot numbers tend to coincide with summer (June-July) and autumn (September-November). This correspondence between the seasons and sunspot numbers is purely coincidental. Feulner also asserts that there may be two simultaneous effects: (i) the precipitable water content, pyranometry and pyrheliometry measurements exhibit pronounced regular seasonal variations, and (ii) the seasonal distribution of sunspot numbers can give the impression of a change with solar activity.

Feulner also defined the years 1928-1931, 1932-1933, 1951-1952, and 1953-1955 as years with active volcanism (there is some discussion about such forcings here and here, but not for the same period). To account for these aspects, he subtracted the median of the measurements (precipitable water content, etc.) for the corresponding calendar month for all the data, and excluded the years with volcanic activity. After accounting for seasonal bias and volcanic eruptions, Feulner finds no significant trend with the sunspot number, and that the solar activity influence is “comparatively small”.
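The seasonal correction described above can be sketched in a few lines. This is only an illustration under stated assumptions, not Feulner's actual code: the `(year, month, value)` record layout and the function names are hypothetical, the volcanic periods are read as inclusive year ranges, and the monthly medians are computed from all the data before the volcanic years are dropped, which is one reading of the description.

```python
from collections import defaultdict
from statistics import median

# Year ranges flagged as volcanically active in the text (treated as inclusive here,
# which is an assumption).
VOLCANIC_PERIODS = [(1928, 1931), (1932, 1933), (1951, 1952), (1953, 1955)]


def in_volcanic_period(year, periods=VOLCANIC_PERIODS):
    """Return True if the year falls inside any flagged period."""
    return any(start <= year <= end for start, end in periods)


def monthly_anomalies(records, periods=VOLCANIC_PERIODS):
    """Remove the seasonal cycle, then drop volcanic years.

    records: iterable of (year, month, value) tuples (hypothetical layout).
    Returns (year, month, anomaly) tuples, where anomaly is the value minus
    the median of all values for that calendar month.
    """
    records = list(records)
    by_month = defaultdict(list)
    for _, month, value in records:
        by_month[month].append(value)
    medians = {month: median(values) for month, values in by_month.items()}
    return [
        (year, month, value - medians[month])
        for year, month, value in records
        if not in_volcanic_period(year, periods)
    ]
```

Anomalies produced this way can then be compared against sunspot number without the seasonal cycle masquerading as a solar signal.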

Legras et al. refuted recent papers by Le Mouel et al. (2010) – referred to as ‘LMKC’ in their paper – and Kossobokov et al. (2010) on three counts: (1) by demonstrating that a correlation with solar forcing alone is meaningless unless other relevant factors are accounted for too (something Gavin and I also demonstrated in a JGR article from 2009); (2) by demonstrating why long climate series must be homogeneous if one wants to study long-term variability; and (3) by showing that incorrect application of statistical tests provides misleading results. One sentence in their abstract may be read as a strong statement that some people may find problematic: “sunspot counting is a poor indicator of solar irradiance”, but I too have some queries regarding the sunspot record. They also provide their data and methods as a Mathematica notebook in the paper supplement, which I find commendable (I have not checked it, though), as sharing data and open-source code makes their arguments more convincing.

The analysis of Legras et al. focused on the maximum, minimum and mean daily temperature records from the ECA&D dataset representing Praha (since 1775) and Bologna (since 1814) – data from Uccle, they say, were not available due to policy changes at the Belgian Met. Office. They also refer to an analysis by Wijngaard, saying that more than 94% of the [ECA&D] stations are flagged as ‘doubtful’ or ‘suspect’ over the period 1900-1999. Whereas the LMKC paper suggests that the chosen station records have the highest quality code in the ECA&D data set, Legras et al. present a table where all the stations are listed as ‘suspect’. Furthermore, they argue that the Bologna series exhibits a clear artefact: a large positive anomaly in daily maximum temperature, greater than 2°C, between 1865 and 1880, not seen in the daily minimum temperature, and hence that the LMKC and related papers are all based on raw inhomogeneous data, contrary to what is claimed.

When it comes to the methodological flaws, Legras et al. argue that LMKC do not properly account for the real degrees of freedom in the data. Their estimate of the effective degrees of freedom is 9 times smaller than that in LMKC. The larger value used in LMKC leads to a much narrower confidence interval than Legras et al. obtain, including when they use a so-called ‘non-parametric’ permutation test (this gets a bit technical/statistical). This means that the results in LMKC appear to have much greater significance than the data really support. Rather, the results show some variations that are in accord with random statistical fluctuations.
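To see why the effective degrees of freedom matter, here is a minimal sketch using a common textbook approximation for a series with AR(1)-like persistence, n_eff = n (1 - r1) / (1 + r1), where r1 is the lag-1 autocorrelation. This is an assumption chosen for illustration; it is not necessarily the estimator used by Legras et al. or by LMKC, and the function names are hypothetical.

```python
def lag1_autocorrelation(x):
    """Sample lag-1 autocorrelation of a sequence of numbers."""
    n = len(x)
    mean = sum(x) / n
    dev = [v - mean for v in x]
    denom = sum(d * d for d in dev)
    if denom == 0:
        raise ValueError("series is constant; autocorrelation undefined")
    return sum(dev[i] * dev[i + 1] for i in range(n - 1)) / denom


def effective_sample_size(n, r1):
    """AR(1) approximation: n_eff = n * (1 - r1) / (1 + r1)."""
    if not -1 < r1 < 1:
        raise ValueError("r1 must lie strictly between -1 and 1")
    return n * (1 - r1) / (1 + r1)
```

Under this approximation, a lag-1 autocorrelation of 0.8 turns 900 data points into the equivalent of about 100 independent ones. That number is purely illustrative and says nothing about the values in either paper, but it shows how treating persistent data as independent makes confidence intervals look far narrower than they should be.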

The results of Feulner and Legras et al. are convincing because of their careful analysis of the data and what they represent. They also explain why previous results and analyses are wrong, providing clear demonstrations showing how the methods work. Furthermore, they bring up well-established knowledge about the dangers associated with statistics and analysis, such as trend analysis of inhomogeneous data series. In essence, it is common sense that it is important to know what signals are hiding in your data, and how these can affect your analysis.

44 Responses to “How easy it is to get fooled”

  1. 1
    Lazarus says:

    Good Post.

    I think a lot of the time it isn’t so much getting fooled as seeing what you want to see. That is why a good standard of peer review is essential.

  2. 2
    tamino says:

    I took a good look at Legras et al., and downloaded some of the data (from ECA&D) to study it for myself. It’s quite clear: Legras et al. nailed it. Excellent work, and it’s an object lesson in just how hard it is to get things right.

    It’s also a powerful argument for “peer review” — not just to weed out papers that shouldn’t be published, but for other scientists to examine what does get published critically.

  3. 3
    Davos says:

    Seems to me another good manifestation as to why it is good that editors permit good papers to be published by good people who don’t agree with each other (including Steig and O’Donnell). There are lots of data sets in climate science with missing/contaminated entries, and sometimes scientists extrapolate sparse data across large distances to represent what is currently best thought to be accurate. I like that Legras et al attempted to supply their rendition of ‘what is’ and didn’t simply leave it at poking holes and highlighting uncertainty in someone else’s paper.

    It also seems clear from these papers that having experts in statistics as peer reviewers is a worthwhile endeavor; more so when they know the information about what the data actually is (instead of just numbers).

  4. 4
    skept says:

    rasmus : I think there is a problem with your http link
    “but I too have some queries regarding the sunspot record)”

    [Response:Thanks! It’s now fixed. -rasmus]

  5. 5
    Edward Greisch says:

    Wouldn’t you rather have somebody check your work before publishing so you don’t get embarrassed? Peer review is mostly good. At least it spreads the blame a little.

    reCaptcha is getting hard for humans.

  6. 6
    Tom Fowle says:

    Really interesting analysis. Wish there weren’t so many typo/grammar errors, though; makes reading cumbersome.

  7. 7
    Dan H. says:

    Yes, I always have several peers check my work before submission. This does give valuable feedback, and results in a much smoother peer-review process. I have been on both sides of the peer-review process, and have seen some really good and lousy papers submitted. I have also seen some very good and very lousy reviewers. This can arise partly from the reputation of the journal, and the quality of submitted papers. I have been to symposia where some less-than-quality papers were used to fill voids. Peer-review is not perfect, and should not be treated as such, but it is still a good process.
    I agree with the recaptcha comment

  8. 8
    Biochar says:

    Re Edward, concerning captcha …

    Denier-bots live! Why are online comments’ sections over-run by the anti-science, pro-pollution crowd?

  9. 9
    Eli Rabett says:

    For more on peer review of Feulner, allow Eli to blog whore a bit. After all you gazumphed him.

  10. 10
    dz alexander says:

    I found it fascinating to read this, which includes comments by four anonymous referees & responses by Legras

  11. 11
    One Anonymous Bloke says:

    No-one is infallible, but we are experts in seeing pattern where there is merely coincidence. Some patterns make very powerful illusions that are persuasive to the untrained eye, but mundane once you are familiar with them – brocken spectres spring to mind. This emphasizes the importance not just of peer-review, but of inter-disciplinary peer review.

  12. 12
    tamino says:

    I’ve posted a closer look at the Legras et al. paper.

  13. 13
    John Edmondson says:

    I tried this question over at

    What would the GCM output be if?

    The date was 50,000,000 years ago and the sun was 2% less bright i.e. the TSI is 1340 w/sqm rather than the 1366 today.

    Assume all other variables don’t change.

    Does anybody know if it is possible to run the GCMs in this fashion?

  14. 14
    Phil Scadden says:

    #13 Yes, it is. Paleoclimate literature has many papers doing this kind of thing (though things were so different 50myr ago, why would you keep other variables the same?). Chapter 6 of IPCC WG1, “Methods -paleoclimate Modelling” and “Pre-Quaternary Climates” sections is probably a good place to start, including a section on PETM which appears to be your interest.

  15. 15


    The flux density absorbed by the climate system is

    F = (S / 4) (1 – A)

    where S is the solar constant and A the Earth’s bolometric Russell-Bond spherical albedo. The present S = 1366.1 W m^-2 and A = 0.306, so F comes out at 237 watts per square meter.

    To find the Earth’s radiative equilibrium temperature, we invert the Stefan-Boltzmann law:

    Te = (F / σ)^(1/4)

    For the SI, the Stefan-Boltzmann constant σ is 5.6704 x 10^-8 W m^-2 K^-4, so Te comes out at 254.3 K.

    Earth’s mean global annual surface temperature, M-GAST or Ts, is 288.2 K, 33.9 K higher than Te due to the greenhouse effect. As a very crude approximation we might expect that, for a small change in Te, Ts will remain about 100 (288.2 / 254.3) – 100 or 13% higher.

    Let’s change S to 1,340 W m^-2. We then have Te = 253.0 K and, scaling by the unrounded ratio 288.2 / 254.3, Ts of about 286.8 K. The solar change resulted in a drop of roughly 1.4 K in surface temperature. Enough to make a difference!

    Of course, 50 million years ago, the atmosphere, albedo and so on were probably substantially different.
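The arithmetic in this comment can be checked with a short sketch. The constants are the ones quoted above; the function names are hypothetical, and the surface-temperature step simply assumes, as the comment does, that Ts stays in fixed proportion to Te. Note that the size of the resulting drop is sensitive to how the Ts / Te ratio is rounded; the sketch carries the ratio unrounded.

```python
SIGMA = 5.6704e-8  # Stefan-Boltzmann constant, W m^-2 K^-4


def absorbed_flux(S, A=0.306):
    """Flux density absorbed by the climate system, F = (S / 4) (1 - A)."""
    return (S / 4.0) * (1.0 - A)


def equilibrium_temperature(S, A=0.306):
    """Radiative equilibrium temperature, Te = (F / sigma)^(1/4)."""
    return (absorbed_flux(S, A) / SIGMA) ** 0.25


def scaled_surface_temperature(S, S_ref=1366.1, Ts_ref=288.2, A=0.306):
    """Crude estimate: hold the ratio Ts / Te fixed at its present-day value."""
    ratio = Ts_ref / equilibrium_temperature(S_ref, A)
    return ratio * equilibrium_temperature(S, A)


if __name__ == "__main__":
    for S in (1366.1, 1340.0):
        print(S, round(equilibrium_temperature(S), 1),
              round(scaled_surface_temperature(S), 1))
```

Because Te scales as S^(1/4), a roughly 2% change in S changes Te by only about 0.5%, which is why the estimated surface change is of order one to two kelvin.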

  16. 16
    Donald Oats says:

    And wrt #13 John, some amount of continental drift may need accounting for, when setting up the boundary conditions for the GCM 50Myr ago. The water and land locations were significantly different, as were the orbital dynamics – see Laskar et al (2010, etc) or Jack Wisdom (Google is your friend here) for the orbital parameters and other factors.

    Donald Oats

  17. 17
    David B. Benson says:

    Generally useful quotation pertaining to the topic of this thread:

  18. 18
    Dan H. says:

    One of my favorite quotations. I know of many instances where it has been invoked, and unfortunately, sometimes I was the one removing all doubt.

  19. 19
    Chris Dudley says:

    This is a question for Gavin I think. I’m trying to understand fig. 30 in Hansen’s new book ‘Storms of My Grandchildren’. There Hansen claims that it is based on data from “Efficacy of Climate Forcings” from 2005. In that paper, there is a similar figure 25. I’ve figured out the difference in vertical scale between the two and I realize that the apparently low sensitivity in the Storms plot is owing to using 100 year response rather than equilibrium response. But, there are two points in the Storms plot that do not appear in the Efficacy plot. One is for a carbon dioxide forcing of about 18 W/m^2 which I presume is from a 16 times carbon dioxide run. The other is for about the same solar forcing. I can’t find those data described in tables 1 or 3 of the Efficacy paper preprint at GISS or in the data archive there. Do you know if the output from those runs is validated and available? I am particularly interested in precipitation plots. Thanks.

    [Response: Sorry – I only have the paper and online output. Try emailing Jim. – gavin]

  20. 20
    Ron Manley says:

    First of all, it’s a bit disappointing that, at the time of writing, 177 people had commented on the E&E slurs but only 19 on the important topic of data.

    The quotation “sometimes there are papers published that do not appreciate the importance of meta-data (information about the data)” makes an important point. No amount of reworking of the temperature record will achieve much until meta-data are available on the thousands of stations used in the analysis.

    It may be a bit pedantic but it is wrong to assume a constant albedo. The ERBE experiment showed significant seasonal influences.

  21. 21
    tamino says:

    Re: #20 (Ron Manley)

    The reason data inhomogeneity was such a crippler for the LeMouel paper, is that they only used 3 stations. If they had a much larger sample, or had made the best use of metadata, inhomogeneities would still be a problem but not an insurmountable one.

    Also, many inhomogeneities can be detected (and either compensated for, or eliminated) when there’s enough data to compare nearby stations for consistency. In fact, for the Bologna data (used in LeMouel et al.) you don’t even *need* metadata or nearby stations to identify serious inhomogeneity.

    What’s really pathetic is that when climate scientists do have metadata (for USHCN stations), and actually use it to improve data quality, denialists accuse them of fraud for having “manipulated” the data. Truly — it’s pathetic.

    Bottom line: your comment about the temperature record is naive at best, at worst … ?

    As for your disappointment, I personally have blogged not just once, but twice, on the data (and analysis) issues raised in this post.

    And yes, you’re pedantic.

  22. 22
    Hank Roberts says:

    > for the Bologna data (used in LeMouel et al.) you don’t even *need*
    > metadata or nearby stations to identify serious inhomogeneity.

    It’ll be interesting to see what comes from the Berkeley EST project. They say they’re starting with raw data from more stations, but haven’t identified their data sources or made their data collection public. They should.

    Good thing it wasn’t organized out of Petaluma, or Vallejo, or Mokelumne.

    Although they should’ve considered Redding.

  23. 23
    RW says:

    Interesting thread. I have a question about some frequently referenced data:

    I’m wondering if someone can shed some light on this subject for me. I’ve searched around at length all over and cannot find a clear answer. The 3.7 W/m^2 estimated from simulations for the increase in ‘radiative forcing’ from a doubling of atmospheric CO2 – does the 3.7 W/m^2 represent a reduction in the atmospheric window or does it represent the half directed down due to isotropic re-radiation/redistribution (meaning a reduction in the atmospheric window of 7.4 W/m^2)???

    [Response: It is the global mean change in outgoing LW flux at the tropopause (integrated over the whole spectrum) for a doubling of CO2. – gavin]

  24. 24
    Jim Eaton says:


    Hey, I was born and raised in Vallejo!

    On second thought, you are right.


  25. 25
    Chris Dudley says:

    RW #23,

    There is a drawing in fig. 2 of Hansen et al. 2005 that goes through a number of definitions of forcings.

    This is a change in flux in response to an instantaneous change in carbon dioxide but after the stratosphere has responded to that change by cooling (panel b in fig. 2) assuming that the system was in equilibrium prior to the doubling of carbon dioxide.

    I think maybe your idea of closing a window has all the effect of carbon dioxide in a single layer so the redistribution idea would not really apply.

  26. 26
    RW says:


    How is the 3.7 W/m^2 derived? Is it the incremental absorption or not? If it is, then how is it determined that all of it will affect the surface?

  27. 27
    Hank Roberts says:

    RW, are you looking at Fig. 2 as Chris suggested?

    Do you see it? You have to open (maybe download) the PDF file.

    It’s the second picture, top of p.D18104

    “Figure 2. Cartoon comparing (a) Fi, instantaneous forcing, (b) Fa, adjusted forcing, which allows stratospheric temperature to adjust, (c) Fg, fixed Tg forcing, which allows atmospheric temperature to adjust, (d) Fs, fixed SST forcing, which allows atmospheric temperature and land temperature to adjust, and (e) DTs, global surface air temperature calculated by the climate model in response to the climate forcing agent.”

    Just below it in text:

    “The cartoons in Figure 2 compare alternative forcing definitions. We calculate Fs for all forcings and Fi and Fa for cases in which they are readily computed. We suggest that Fs has a good physical basis, because the time constant for the surface soil temperature to adjust usually is short, more like the time constant for the troposphere than the time constant for the ocean. Nevertheless, each of the forcing definitions needs to be judged on its practical utility for climate change analyses, and computation of several of them may aid understanding of climate forcing mechanisms.”

    Don’t rely on my cut’n’paste, do look at the actual document Fig. 2.

  28. 28
    Hank Roberts says:

    RW, are you getting the “incremental absorbtion” words from something you read? A search for “climate+forcing”+“incremental+absorbtion”
    finds only a couple of mentions of it by people with, well, odd ideas:
    One by “co2isnotevil” December 7, 2010 at Judith Curry’s
    One in a skepticalscience thread referring to a Republican PR document

    Where did you come across the idea?

  29. 29
    Hank Roberts says:

    RW1 at skepticalscience:
    discussed there in several threads.

  30. 30
    Daniel Bailey says:

    @ Hank:

    RW has been asking much the same questions over at Skeptical Science for some time now (several threads covering hundreds of responses). Not liking the answers he’s received, he has come here.

    He’s a disciple of George White (“co2isnotevil”).

    The Yooper

  31. 31
    RW says:

    Hank (RE: 28),

    Why does it matter? Gavin didn’t answer my question. He answered a different question – one I already know the answer to.

    RE: 27,

    No the figure doesn’t tell me anything.

  32. 32
    Hank Roberts says:

    RW, you asked:

    > does the 3.7 W/m^2 represent
    > a reduction in the atmospheric window or
    > the half directed down …
    > (meaning a reduction in the atmospheric window …)???

    No, it doesn’t.

  33. 33
    chris colose says:

    RW– Actually gavin did answer your question in 23. If you meant something else then you need to phrase your questions better. What greenhouse gases do is reduce the outgoing longwave radiation (at fixed T) and the radiative forcing is a measure of that net irradiance change, in this case defined at the tropopause.

    The value of “3.7 W/m2” is not some theoretical constraint imposed on a model. It is also not some universal constant for doubling CO2, but rather is unique to the modern atmosphere and can change with different overlap with other gases, or even at different CO2 concentration regimes (the logarithmic forcing breaks down for example at very low or very high concentrations). The behavior in large part manifests itself due to the shape of the CO2 absorption spectrum, as discussed in raypierre’s climate book. In any case, for many models (including NASA’s ModelE), the radiative calculations are done explicitly by accounting for the temperature distribution and absorber amount that is encountered at each grid box. Hansen et al (1988, JGR, see appendix B) derived the relationship of

    dTo = f(x) - f(xo), where f(x) = ln(1 + 1.2x + 0.005x^2 + 0.0000014x^3) and x is the CO2 concentration, so a doubling of CO2 would be, for example, f(500)-f(250) with the units given in temperature already, but of course neglecting complex feedbacks that you need GCM’s for. A CO2 concentration at 300 ppm would then be 6.7 C, which is 6.7/33=20% of the greenhouse effect. There have been other studies using line-by-line radiative transfer models (see e.g., Myhre et al 1998) that come to the answer you give, which gives a simple approximation as ~5.35 ln(C/Co), where the units are now in W/m^2 and C and Co are the final and initial CO2 concentrations. It should be noted that models do fluctuate somewhat about this number. A sample of the GCMs used in the AR4 show an all-sky radiative forcing between 3.39 and 4.06 W/m^2 when doubling CO2 from the pre-industrial state (see Table 2 in Forster and Taylor, 2006).
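As a quick check on the numbers quoted in this comment, the polynomial-in-a-logarithm expression attributed to Hansen et al. (1988) can be evaluated directly. This is a sketch only: the function names are hypothetical, x is taken to be in ppm (as the 300 ppm example implies), and the result is the no-feedback temperature term described in the comment, not a GCM result.

```python
import math


def f(x):
    """f(x) = ln(1 + 1.2x + 0.005x^2 + 0.0000014x^3), with x the CO2 concentration in ppm."""
    return math.log(1.0 + 1.2 * x + 0.005 * x**2 + 0.0000014 * x**3)


def delta_t0(x, x0):
    """No-feedback temperature change between concentrations x0 and x, in degrees C."""
    return f(x) - f(x0)


if __name__ == "__main__":
    print(round(f(300.0), 1))               # value the comment quotes for 300 ppm
    print(round(delta_t0(500.0, 250.0), 2))  # doubling from 250 to 500 ppm
```

Evaluating f(300) reproduces the 6.7 C figure quoted in the comment, and the 250 to 500 ppm doubling comes out a little above 1 C before feedbacks.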

  34. 34
    Hank Roberts says:

    P.S., “atmospheric window” could have a lot of meanings, depending on where you read it.

    Look at the ongoing attempts to sort it out here:

  35. 35
    Hank Roberts says:

    P.P.S., an example of casual use:

    “Global warming refers to the “enhancement” of the greenhouse effect – closing the “atmospheric window” …. more on Global Warming next semester”

    from which has a nice clickable tool showing various greenhouse gases and how they add up, here:

  36. 36
    RW says:

    Hank (RE: 32),

    OK, then what is the reduction in the atmospheric window from a doubling of CO2?

  37. 37
    Hank Roberts says:

    RW: unanswerable as posed. You’ve had that explained elsewhere at length by various people sincerely interested in trying to help you understand this.

    You can find a lot looking up your question in Scholar:

    The change is nonlinear:

    “Atmospheric window” isn’t one single thing you can do arithmetic on; individual papers tell you what the words mean for that particular purpose.

    Here’s an example — Rod will like this one:

    Design strategies to minimize the radiative efficiency of global warming molecules

    “… an atmospheric window between about 800 and 1,400 cm-1 through which blackbody radiation emitted by the Earth is lost and a temperature balance is achieved. There is no unique definition of the ‘atmospheric window,’ ….”

    I recommend this:
    How To Ask Questions The Smart Way
    (ESR doesn’t take his own advice on climate, but it’s still good advice)

  38. 38


    The 3.7 figure is from the Myhre et al. relation:

    RF = 5.35 ln (C / Co)

    where RF is radiative forcing in watts per square meter, C is CO2 concentration in ppmv, and Co is the reference concentration level, usually taken as the preindustrial average of 280 ppmv. You can see that doubling leads to 3.7 W m^-2. For details of how the relation was derived, check the original article:

    Myhre, G., E.J. Highwood, K.P. Shine, and F. Stordal 1998. “New estimates of radiative forcing due to well mixed greenhouse gases.” Geophys. Res. Lett. 25, 2715-2718. You can read the abstract at:

    This was actually a revision downward from an earlier estimate with a proportionality constant of 6.333.
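The doubling figure follows directly from the relation above. A minimal sketch, with a hypothetical function name; the second call assumes the earlier estimate used the same logarithmic form with the 6.333 constant mentioned in the comment:

```python
import math


def co2_forcing(C, C0=280.0, alpha=5.35):
    """Radiative forcing RF = alpha * ln(C / C0), in W m^-2 (C and C0 in ppmv)."""
    return alpha * math.log(C / C0)


if __name__ == "__main__":
    print(round(co2_forcing(560.0), 2))               # doubling with the Myhre et al. constant
    print(round(co2_forcing(560.0, alpha=6.333), 2))  # same doubling with the earlier constant
```

Because the relation is logarithmic, each doubling adds the same forcing regardless of the starting concentration, within the range where the approximation holds.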

  39. 39
    Chris Dudley says:

    RW #36,

    You want to think about a spectral window closing. This is only useful with regard to a whole atmosphere number like forcing in the linear part of what is sometimes called the curve of growth. Under current conditions we are in the logarithmic portion of the curve of growth for the most part. Rather than closing a window, you would do better to think of raising the ceiling. The altitude from which the lapse rate is suspended is increased with more carbon dioxide because the atmosphere is optically thick where there is carbon dioxide opacity up to a higher altitude.

  40. 40
    Chris Dudley says:


    I realize I’ve not quite got to the heart of your question. Back to raising the ceiling. Remember that forcing is a calculationally convenient number. It imagines an instantaneous change in the composition of the atmosphere with no change in temperature. I asked you to imagine the lapse rate being suspended from some altitude which depends on the composition of the atmosphere. More carbon dioxide means starting from a higher altitude. Now, here is my mistake. Once radiative equilibrium is reestablished, this is a very helpful picture because we have just shifted the altitude higher from which the earth radiates but have kept the same temperature which means the surface must be warmer because it is connected by the lapse rate. I used the word ‘suspend’ with that picture in mind. And, it is quite physical.

    But forcing is the lack of equilibrium. So, rather than suspend the lapse rate, we need to nail it to the ground at a fixed temperature and fix it throughout the atmosphere. We then raise the ceiling (by adding carbon dioxide) and take note that the level in the atmosphere from which the earth radiates to space is now higher up and thus colder (on account of the fixed lapse rate). That means that it radiates less. The difference is the forcing.

    Of course, there is some dramatic window shutter closing going on high up (linear part of the curve of growth). But what really matters is that the window shutter is colder than where it used to be at lower altitude because the shutter is the part that radiates. And colder objects radiate less.

    Location, location, location.

  41. 41
    Shirley J. Pulawski says:

    Off-topic, but still relevant to getting fooled, does anyone remember if the group now formally known as Anonymous Hackers were also the anonymous hackers who took the stolen HadCRU emails to WikiLeaks? Now that Anon is going after the Koch Brothers, it all seems very strange. I’m reminding people they could turn on everyone at any time, but realized I need to know with absolute certainty that they were behind it. Since the emails were filtered through WikiLeaks, to me, it seems likely, but that hack is so diametrically opposed to attacks on Koch Brothers sites, which also rather goes against their self-proclaimed Libertarian philosophies.

    [Response: There is no known link between wikileaks, anonymous and the hacking of the CRU emails. – gavin]

  42. 42
    Shirley J. Pulawski says:

    “Response: There is no known link between wikileaks, anonymous and the hacking of the CRU emails. – gavin”

    Well, I guess I’ve been duped by Assange, because he has, in several interviews, claimed to be behind the initial release. When I checked a few months ago, the WikiLeaks official page also claimed responsibility. How easy it is to let the details get blurred. We didn’t know about Anon or WL then – I assumed it was their early days and the names didn’t make it out as strongly then.

    [Response: I cannot vouch for any other statements by Assange, but the release of the CRU emails was on Nov 17, 2009, and more widely on Nov 19, 2009 via a Russian ftp site. The files did not appear on Wikileaks until at least Nov 21. I have no information about whether the original hackers have anything to do with Anonymous. – gavin]

  43. 43
    Jerry Steffens says:

    This seems to be the pattern of one thread here:

    Questioner: “Why do cows bark?”

    Answer: “Cows don’t bark, they moo.”

    Questioner: “You didn’t answer the question.”

  44. 44
    Chris R says:

    #43 Jerry Steffens,

    You’ve hit the nail on the head.