Mountains and molehills

11 Nov 2008 by Gavin

As many people will have read, there was a glitch in the surface temperature record reporting for October. For many Russian stations (and some others), September temperatures were apparently copied over into October, giving an erroneous positive anomaly. The error appears to have been made somewhere between the reporting by the National Weather Services and NOAA’s collation of the GHCN database. GISS, which produces one of the more visible analyses of this raw data, processed the input data as normal and ended up with an October anomaly that was too high. That analysis has now been pulled (in under 24 hours) while they await a correction of input data from NOAA (Update: now (partially) completed).

There were 90 stations for which October numbers equalled September numbers in the corrupted GHCN file for 2008 (out of 908). This compares with an average of about 16 stations each year in the last decade (some earlier years have bigger counts, but none as big as this month, and are much less as a percentage of stations). These other cases seem to be mostly legitimate tropical stations where there isn’t much of a seasonal cycle. That makes it a little tricky to automatically scan for this problem, but putting in a check for the total number or percentage is probably sensible going forward.
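A check of that kind might look like the minimal sketch below. It is an illustration only, not the GISS or NOAA procedure: the data layout, the function names and the 5% threshold are all assumptions. The threshold is picked only because it sits between the usual handful of repeats (about 16 stations) and this month's 90 out of 908.

```python
from typing import Dict, List, Optional

# Hypothetical layout: station id -> 12 monthly means for one year (None = missing month).
StationYear = Dict[str, List[Optional[float]]]


def repeated_month_stations(data: StationYear, month: int) -> List[str]:
    """Stations whose value for `month` (2-12) exactly equals the previous month's value."""
    if not 2 <= month <= 12:
        raise ValueError("month must be between 2 and 12")
    repeats = []
    for station, values in data.items():
        current, previous = values[month - 1], values[month - 2]
        if current is not None and previous is not None and current == previous:
            repeats.append(station)
    return repeats


def repeat_fraction(data: StationYear, month: int) -> float:
    """Fraction of stations reporting both months that repeat the previous month."""
    reporting = [
        s for s, v in data.items()
        if v[month - 1] is not None and v[month - 2] is not None
    ]
    if not reporting:
        return 0.0
    return len(repeated_month_stations(data, month)) / len(reporting)


def looks_suspicious(data: StationYear, month: int, max_fraction: float = 0.05) -> bool:
    """Flag the month when the repeat fraction exceeds an assumed threshold."""
    return repeat_fraction(data, month) > max_fraction
```

Checking the fraction rather than individual stations is what lets the legitimate tropical repeats through: a single station with identical months proves nothing, but a tenth of the network doing so at once is worth a second look.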

It’s clearly true that the more eyes there are looking, the faster errors get noticed and fixed. The cottage industry that has sprung up to examine the daily sea ice numbers or the monthly analyses of surface and satellite temperatures has certainly increased the number of eyes and that is generally for the good. Whether it’s a discovery of an odd shift in the annual cycle in the UAH MSU-LT data, or this flub in the GHCN data, or the USHCN/GHCN merge issue last year, the extra attention has led to improvements in many products. Nothing of any consequence has changed in terms of our understanding of climate change, but a few more i’s have been dotted and t’s crossed.

But unlike in other fields of citizen-science (astronomy or phenology spring to mind), the motivation for the temperature observers is heavily weighted towards wanting to find something wrong. As we discussed last year, there is a strong yearning among some to want to wake up tomorrow and find that the globe hasn’t been warming, that the sea ice hasn’t melted, that the glaciers have not receded and that indeed, CO₂ is not a greenhouse gas. Thus when mistakes occur (and with science being a human endeavour, they always will) the exuberance of the response can be breathtaking – and quite telling.

A few examples from the comments at Watts’ blog will suffice to give you a flavour of the conspiratorial thinking: “I believe they had two sets of data: One would be released if Republicans won, and another if Democrats won.”, “could this be a sneaky way to set up the BO presidency with an urgent need to regulate CO2?”, “There are a great many of us who will under no circumstance allow the oppression of government rule to pervade over our freedom--PERIOD!!!!!!” (exclamation marks reduced enormously), “these people are blinded by their own bias”, “this sort of scientific fraud”, “Climate science on the warmer side has degenerated to competitive lying”, etc… (To be fair, there were people who made sensible comments as well).

The amount of simply made up stuff is also impressive – the GISS press release declaring October the ‘warmest ever’? Imaginary (GISS only puts out press releases on the temperature analysis at the end of the year). The headlines trumpeting this result? Non-existent. One clearly sees the relief that finally the grand conspiracy has been rumbled, that the mainstream media will get its comeuppance, and that surely now, the powers that be will listen to those voices that had been crying in the wilderness.

Alas! None of this will come to pass. In this case, someone’s programming error will be fixed and nothing will change except for the reporting of a single month’s anomaly. No heads will roll, no congressional investigations will be launched, no politicians (with one possible exception) will take note. This will undoubtedly be disappointing to many, but they should comfort themselves with the thought that the chances of this error happening again have now been diminished. Which is good, right?

In contrast to this molehill, there is an excellent story about how the scientific community really deals with serious mismatches between theory, models and data. That piece concerns the ‘ocean cooling’ story that was all the rage a year or two ago. An initial analysis of a new data source (the Argo float network) had revealed a dramatic short term cooling of the oceans over only 3 years. The problem was that this didn’t match the sea level data, nor theoretical expectations. Nonetheless, the paper was published (somewhat undermining claims that the peer-review system is irretrievably biased) to great acclaim in sections of the blogosphere, and to more muted puzzlement elsewhere. With the community’s attention focused on this issue, it wasn’t long, however, before problems turned up, not only in the Argo floats themselves but also in some of the other measurement devices – particularly XBTs. It took a couple of years for these things to fully work themselves out, but the most recent analyses show far fewer of the artifacts that had plagued the ocean heat content analyses in the past. A classic example, in fact, of science moving forward on the back of apparent mismatches. Unfortunately, the resolution ended up favoring the models over the initial data reports, and so the whole story is horribly disappointing to some.

Which brings me to my last point, the role of models. It is clear that many of the temperature watchers are doing so in order to show that the IPCC-class models are wrong in their projections. However, the direct approach of downloading those models, running them and looking for flaws is clearly either too onerous or too boring. Even downloading the output (from here or here) is eschewed in favour of firing off Freedom of Information Act requests for data already publicly available – very odd. For another example, despite a few comments about the lack of sufficient comments in the GISS ModelE code (a complaint I also often make), I am unaware of anyone actually independently finding any errors in the publicly available Feb 2004 version (and I know there are a few). Instead, the anti-model crowd focuses on the minor issues that crop up every now and again in real-time data processing hoping that, by proxy, they’ll find a problem with the models.

I say good luck to them. They’ll need it.


815 Responses to "Mountains and molehills"

Rod B says

17 Nov 2008 at 4:52 PM

Mark, For the record I have never claimed that “AGW does not exist”. Though that seems to exclude me from your definition of a skeptic.

Tracking this debate on two threads is getting confusing to me. And probably annoying to everyone else. :-)
Alastair McDonald says

17 Nov 2008 at 6:50 PM

Well Gavin,

Whether it is opposing my new Kuhnian “paradigm” for climate change or just defending a Feyerabend “old theory” I think you are on the wrong side.

Cheers, Alastair.
Pat Neuman says

17 Nov 2008 at 7:31 PM

In #292 Harold Brooks wrote: “Why should there be any heads rolling at NWS? So far, no one has even mentioned anything involving NWS in this”.

But in the comment of #48 Gavin wrote: “GISTEMP is an analysis of data collated by the NWS and NOAA, and it is not independent of their efforts”.

Furthermore, NWS continues to provide the media with misleading discussion of 30 year averages (sometimes called “normals”).

For years, NWS Meteorologists in Charge (MIC) downplayed global warming while other NWS staff were muzzled from showing trends in temperatures at NOAA NWS climate stations, snowmelt runoff and rainfall intensity.

Heads should roll at NOAA/NWS for failing the public on climate change.
Hank Roberts says

17 Nov 2008 at 8:21 PM

Pat, remember — the election happened. The Inauguration is ten weeks away. After the Administration changes (particularly when changing from Republican to Democratic or vice versa) the political appointees _all_ roll over.

At that point the career civil service people are going to be in a different environment.

I know you aren’t sympathetic particularly with people who haven’t stuck their heads up from the civil service side during the past 8 years or so. But it can be a bad position to be in, the past Administration has really been gutting the career civil service* — and they may do much better now.

Remember this? Don’t read it just for what it says on the surface, it’s old Admin PR –read it for what it was saying about successfully putting political pressure on the career civil servants to toe the line for the political side, read between the lines.

http://www.publicaffairs.noaa.gov/releases2005/apr05/noaa05-r232.html

THAT is changing.
Ray Ladbury says

17 Nov 2008 at 8:44 PM

Magnus, OK, let me see if I can tackle that Mobius Strip of logic: Hmm, if an aspect of a model is absolutely essential to understanding some phenomenon, then that must prove the modeler is biased.

Ow, Ow, Ow… Boy, it hurts to think that way
David says

17 Nov 2008 at 9:25 PM

Gavin, from the wikipedia entry on Feyerabend :

“Feyerabend described science as being essentially anarchistic, obsessed with its own mythology, and as making claims to truth well beyond its actual capacity. He was especially indignant about the condescending attitudes of many scientists towards alternative traditions. For example, he thought that negative opinions about astrology and the effectivity of rain dances were not justified by scientific research, and dismissed the predominantly negative attitudes of scientists towards such phenomena as elitist or racist”

If accurate, this does not inspire me with confidence in his grasp of the scientific method.
MarkusR says

17 Nov 2008 at 10:28 PM

Anyone know if there is a correlation between record melting of Arctic ice this summer, while Scandinavia was unusually cold (based on my mom)? All that melting ice would absorb tremendous amounts of heat in that part of the world.
Hank Roberts says

17 Nov 2008 at 11:20 PM

Note _when_ he wrote that to understand what he was talking about.
You wouldn’t like to be limited to work consistent with old theories; Kuhn said much the same. Look at the “sound science” movement for contemporary examples of that kind of thinking — arguing that credible science requires, well, all the sorts of proof the tobacco lawyers wanted to see before anything could be changed. Google….
Leftymartin says

18 Nov 2008 at 12:01 AM

David (305) – David, come on now, you are missing the point here. Feyerabend rejected the rigidity of the traditional notion of the scientific method, wherein theories should be tested, and judged against, real world data. If you really want to understand Feyerabend’s significance to science (and I argue his views were very significant, and influential in terms of climate science), then head back to that Wikipedia page and have a look at his views on L’affaire de Galileo. Kind of puts the whole climate change controversy in a whole new (and most illuminating) light.
jcbmack says

18 Nov 2008 at 12:56 AM

Denialists, if you guys read the NASA reports and papers, here and on Google, you will see that global cooling, localized cooling trends, warming, and the other data collected that you think is denied by the modellers are all there in black and white; yet we still have a net global warming effect. Checking again as well, parts of the Arctic (western) are still warming while precipitation assists ice thickening in other parts. The data is dealt with as it comes, not as one wishes it to be; read the SCI AM article. Gavin and others have more conservative views than Hansen based upon their looking at the data, but the fact remains that warming is not a good thing indefinitely. Who can go back a hundred thousand years and know all about the climate? No one! Proxy data is, well, approximate at best at times.
Nylo says

18 Nov 2008 at 5:39 AM

The fact that no model exists that can explain the warming in the 20th century without carbon emissions could prove something ONLY if the models could actually explain the medieval warming or the little ice age. But because none of the current models would be able to reproduce these long climatic events, there is reasonable doubt that we could be facing another one of them, maybe smaller, for the same unknown reasons that happened then. Nobody serious enough doubts that the CO2 issue has some influence in the present warming, but historical records allow for a reasonable doubt as to whether it must be the only significant influence as claimed by the IPCC, and therefore, whether part of the warming may revert.

[Response: What makes you think they can’t? volcanic+solar (and maybe land use) give changes over the last 1000 years that are in line with reconstructions (within the error bars of the forcings/sensitivity and reconstructions) – the problem is that the signal is not very large (a few tenths deg C) compared to the uncertainty. – gavin]
Ray Ladbury says

18 Nov 2008 at 5:57 AM

David #305: I think that was Gavin’s point. Feyerabend was one of those “sociologists of science” who felt you shouldn’t ask the scientists how or why they did things since they were biased anyway.
Maurizio Morabito says

18 Nov 2008 at 6:54 AM

Gavin – you replied to #164 “More words are devoted to much more trivial topics (Paris Hilton anyone?) and I am not responsible for what people want to comment on”

Sometimes I do think that if I would have been a prosecutor, I would have loved to get somebody arguing in your style, as a defence witness 8-)

Why? Because your words claim one thing but your actions show something else. You say it’s all been a “molehill” but manage to make this whole episode into a 1,000-word-plus mountain.

All you’d have had to do was post a blog saying “Sorry. Not exactly our fault. But thanks”. THAT is what a true molehill would have deserved.

Because if the “October faulty data” were a molehill, then the remaining 1,110 words in your blog were a waste of yours and everybody else’s time. And still you are monitoring this blog…check your reply to #299.

Methinks your behavior shows that this particular issue is no molehill. It’d make you look very good were you to admit it, but then that may be overstretching my imagination way too much.

ps people that write at length about Paris Hilton don’t consider her a molehill

pps so what is the monetary cost? Quite more than a beer, considering how long this page has become.
dhogaza says

18 Nov 2008 at 7:19 AM

After the Administration changes (particularly when changing from Republican to Democratic or vice versa) the political appointees _all_ roll over.

At that point the career civil service people are going to be in a different environment.

Except for those political appointees that the executive rolls over into career civil services positions, making it difficult to remove them.
Kevin McKinney says

18 Nov 2008 at 8:12 AM

Re #310. Nylo wrote:

“. . . historical records allow for a reasonable doubt as to whether it must be the only significant influence as claimed by the IPCC. . .”

Apologies if you already knew this, Nylo (I’m not sure what you’d characterize as “significant,”) but the IPCC AR4 policymaker’s summary says: “Changes in solar irradiance since 1750 are estimated to cause a radiative forcing of +0.12 [+0.06 to +0.30] W m–2. . .”

So the IPCC considered solar forcing–to take just one example of non-CO2-related forcings–significant enough, at least, to quantify as closely as possible.

With regard to the Medieval Warm Period, there seems to be significant (that word again!) reason to suspect that it was a regional, not global, phenomenon (see for instance, “The Discovery of Global Warming.”) This highlights the fact that we don’t just have the problem of *reproducing* the MWP with models, but a real problem of characterizing it correctly from rather limited data. All that paleoclimate research with proxies is tough stuff. Worth doing, but still tough.
tamino says

18 Nov 2008 at 9:04 AM

Re: #313 (Maurizio Morabito)

Have you nothing better to do with your time than to go on and on … and on and on … and on and on about how this post is over 1000 words long? What a pathetic attempt to extend your 15 seconds of “ha-ha-you-made-a-mistake”!

The most enlightening aspect of this incident is that it shows us all just how shallow and petty is the mindset of denialists.
Ray Ladbury says

18 Nov 2008 at 9:04 AM

Gavin and Lefty #308, I’m afraid I disagree on Feyerabend. In my opinion, he was more interested in generating controversy than understanding science. While it is true that methodology varies in different branches of science, Bacon and Galileo would probably still recognize what is being done as science. Most of his interpretations of scientific methodology and practice represent a deep misunderstanding of how scientists actually do science, and some of his ideas were downright loony. I don’t have much use for sociologists and historians of science who think they can understand how science works without talking to those who actually do science. His is a philosophy of science we could have done without.
Kevin McKinney says

18 Nov 2008 at 9:30 AM

I note that the NCDC October update is still unavailable (the site is still showing it as “unavailable till the 17th.”) Is this related to the reevaluation efforts we have been hearing about?

One of the takeaways from this whole affair for me is just how little I understand the roles and interrelationships of the various agencies and organizations involved in climate reporting.
Barton Paul Levenson says

18 Nov 2008 at 9:44 AM

I just talked to Paris, and she asked me to say, “Please tell Mr. Schmidt that I am NOT a trivial topic! Remember, I ran for President of the United States this year, and I think my energy plan was pretty [darned] good.”
Rod B says

18 Nov 2008 at 9:57 AM

Kevin (315):

“…This highlights the fact that we don’t just have the problem of *reproducing* the MWP with models, but a real problem of characterizing it correctly from rather limited data. All that paleoclimate research with proxies is tough stuff….”

Just an aside comment: but often it is taken as unassailably golden when used in support of current AGW.
Ben Lankamp says

18 Nov 2008 at 10:06 AM

The Hadley Centre (HadCRU) numbers are available. Their October 2008 anomaly is +0.38 (1971-2000 base period). For comparison, GISS has +0.41 for the same base period. This means last month sits at place 6 in the top 10 of warmest October months since 1850, or place 6 since 1880 for GISS. The NCDC analysis is also available but gives a much higher number than HadCRU and GISS (+0.47). They put October 2008 at a number 2 position in their top 10.
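Comparing anomalies from different analyses only makes sense once they share a base period. A rough sketch of that re-baselining step follows, assuming a simple year-to-anomaly mapping for a single calendar month; the function name and data layout are illustrative, not how HadCRU, GISS or NCDC actually store their series.

```python
from typing import Dict


def rebaseline(anomalies: Dict[int, float], start: int, end: int) -> Dict[int, float]:
    """Re-express anomalies relative to their own mean over the years start..end inclusive."""
    base_years = [year for year in range(start, end + 1) if year in anomalies]
    if not base_years:
        raise ValueError("no data in the requested base period")
    # The mean over the new base period becomes the new zero line.
    offset = sum(anomalies[year] for year in base_years) / len(base_years)
    return {year: value - offset for year, value in anomalies.items()}
```

Called as `rebaseline(october_series, 1971, 2000)` on a hypothetical October series, it shifts every value by the same constant, so the ranking of warm and cold years is unchanged; only the numbers quoted for each year move.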
Ben Lankamp says

18 Nov 2008 at 10:28 AM

The Hadley Centre (HadCRU) numbers for October are available. The Hadley analysis puts October 2008 at +0.38 (base period 1971-2000) where GISS has +0.41. I could be wrong but the difference between the two appears to be within the margin of error. Last month gets a #6 position in the top 10 of warmest October months, according to HadCRU. GISS puts October 2008 at #5. The NCDC analysis is available as well, but theirs looks a tad warm with a whopping +0.47 and a #2 position only losing to 2003.
Kevin McKinney says

18 Nov 2008 at 11:03 AM

Re 320:

“Just an aside comment: but often it is taken as unassailably golden when used in support of current AGW.”

A matter of perception, perhaps, relatively how often this is done on each side of the debate? Both sides have, shall we say, less-informed and more-informed advocates. Readers here will not characterize paleo data as “unassailably golden,” though they may well feel (as I do) that the preponderance of paleo-climatic evidence favors the AGW consensus. (Especially after that last Mann paper.)

On the other hand, “unassailable” seems to be a good word to characterize the denialist perception of the MWP. (The skeptic perception would not fall under this category.)
Kevin McKinney says

18 Nov 2008 at 11:05 AM

322: Thanks for the info, Ben.
Pat Neuman says

18 Nov 2008 at 11:54 AM

Harold, you said (#304) that I should read between the lines. Then you posted a link to a NOAA Press Release on NOAA Deputy Director Jack Kelly having received the David O. Cooke Leadership in Federal Service Award.

While director of NWS, Jack Kelly approved suspensions issued to me in 2000 because I tried to inform the public about hydrologic change due to climate change happening in the Upper Midwest. At that time Clinton and Gore were in the White House and I was a Senior Hydrologist at the NWS North Central River Forecast Center in Chanhassen, MN.

Information was being withheld from the public on global warming and climate change well before G.W. Bush’s 1st term.

Incidentally, because I continued to talk about climate change, my career at NWS ended in 2005.

You suggest that things are changing now at NOAA. I doubt that things are changing enough.
snorbert zangox says

18 Nov 2008 at 12:00 PM

Barton Paul Levenson,

In #179, you said:

“The fact that cosmic rays are a mechanism for cloud formation does not mean they are a major factor in Earth’s climate. The GCR flux in Earth’s atmosphere is about 5 particles per square centimeter per second, which isn’t enough to generate a serious amount of clouds.

What’s more, the GCR flux has been stable for 50 years, so it can’t have contributed to the sharp upturn in global warming of the last 30.”

How do you know that cosmic ray flux changes have not been enough to change the temperature of the climate? Where have we spent the billions of dollars in research funds to pursue the question? Who has investigated the sunspot/temperature thoroughly?

By the way, what sharp upturn in global warming are you talking about? I cannot see a significant difference between the Earth’s temperature in the past 30 years. I see a rise of 0.2 to 0.3 degrees.

Let’s pretend that a Martian scientist flies in and that we show him two graphs. One graph compares the climate temperature to the concentration of carbon dioxide in the air; it has a correlation coefficient (percentage) in the low to mid 20s. The other displays a comparison of climate temperature to sunspot activity; it shows a correlation coefficient in the mid to high 80s. Which do you think he would pursue in his quest to understand the reasons for increasing climate temperatures?

Let’s go on and show him a third graph, one displaying the 2003 ice core data that show that throughout history temperature changes have preceded carbon dioxide changes by half a century. Then let’s show him a graphic that displays glacier retreat and other climate warming symptoms preceding the current upturn in carbon dioxide emissions by at least 100 years. Would those increase or decrease his interest in pursuing a carbon dioxide causation for the ongoing climate warming?
Kevin McKinney says

18 Nov 2008 at 12:17 PM

“Let’s go on and show [a hypothetical Martian scientist] a third graph, one displaying the 2003 ice core data that show that throughout history temperature changes have preceded carbon dioxide changes by half a century. Then let’s show him a graphic that displays glacier retreat and other climate warming symptoms preceding the current upturn in carbon dioxide emissions by at least 100 years. Would those increase or decrease his interest in pursuing a carbon dioxide causation for the ongoing climate warming?”

Depends if he’s astute enough to note that the CO2 emissions are a *new* forcing, as well as an historic feedback, or to note that there are other factors affecting warming than just GHGs–see the IPCC AR4 for attributions. It’s easy to access from the RealClimate sidebar link.
tamino says

18 Nov 2008 at 12:31 PM

Re: #326 (snorbert zangox)

Please tell us exactly what datasets, and what time intervals, you’ve used to get correlation of temperature and CO2 in the low-to-mid 20s, and correlation with sunspot activity in the high 80s.
Hank Roberts says

18 Nov 2008 at 1:18 PM

> let’s show … Let’s go on and show …

Show your cites.

Our Martian will thank you for offering the cherries you’ve picked out, but will ask for all the information in the journals and be skeptical of your claims.
Kevin McKinney says

18 Nov 2008 at 1:38 PM

I think it’s become clear from this thread that there is a perceptual disconnect between those (mostly denialists/skeptics) who view, or at least represent, the GISS data product as purporting to be authoritative and “graven in stone” and those who (like Gavin) see it as a work in progress, carried out under serious limitations of time and funding, and subject to ongoing correction by the scientific (broadly speaking!) community.

The first tend to see this as a mountain, since they will think that but for the due diligence of amateurs, this error would have been eternally part of the record (and possibly so by some sinister design.) The second see it as a molehill, since it was caught quickly, and would have been caught and corrected sooner or later regardless.

This leads me to wonder: is there a policy statement characterizing GISSTEMP data as it is supplied and updated over time? And if not, would it be a good idea to develop one, so that those using the data know just what they are using? If the second understanding is correct–and I am presuming it is–then perhaps an explicit statement would be helpful in educating those who see GISSTEMP (or initial GISSTEMP results) in the first manner described above.
Ray Ladbury says

18 Nov 2008 at 1:45 PM

Snorbert Zangox says: “Let’s pretend that a Martian scientist flies in and that we show him two graphs.”

It seems that your entire approach is based on pretending. Why not actually learn the science so you can at least argue against it intelligently. We know GCR fluxes aren’t changing significantly because we can and have measured them. We also don’t know of a mechanism whereby GCR fluxes at current levels significantly affect climate. “It might be…” is not science unless you have a mechanism.

Of course, you would then have to explain why the physics of the greenhouse should magically change between 280 ppmv and 380 ppmv, but since you’re all about pretending, that shouldn’t pose much of a barrier.
Leftymartin says

18 Nov 2008 at 1:56 PM

Ray Ladbury (317) – sorry Ray, I guess I was a little too obtuse in my comments on Feyerabend…in fact I agree with you for the most part. I was entirely correct in noting Feyerabend’s significant influence on, for lack of a better word, the “philosophy” of climate change science, and pointed to his views on L’affaire de Galileo as an illustration of this. Feyerabend spoke thus:

“The church at the time of Galileo was much more faithful to reason than Galileo himself, and also took into consideration the ethical and social consequences of Galileo’s doctrine. Its verdict against Galileo was rational and just, and revisionism can be legitimized solely for motives of political opportunism.”

Neatly encapsulates a lot of the attitudes one is confronted with on both sides of this debate. Does make things interesting though.
Mark says

18 Nov 2008 at 2:18 PM

Leftymartin, #332, “Neatly encapsulates a lot of the attitudes one is confronted with on both sides of this debate. Does make things interesting though”

Does it?

Equally? (since that is the inference from an unqualified statement such as the one you made)
William Astley says

18 Nov 2008 at 2:44 PM

In reply to comment #331 Ray Ladbury: “It seems that your entire approach is based on pretending. Why not actually learn the science so you can at least argue against it intelligently. We know GCR fluxes aren’t changing significantly because we can and have measured them. We also don’t know of a mechanism whereby GCR fluxes at current levels significantly affect climate. “It might be…” is not science unless you have a mechanism.”

Ray,
GCR has increased 18% in the last 3 years and has changed in the last 2 months. The solar wind strength and density is currently at its lowest level in 45 years. There is strong correlation of GCR changes and planetary cloud cover up until 1992. GCR has changed. Post 1992 there is a solar mechanism that could possibly explain the observations. The problem is that it is difficult to measure planetary albedo/cloud cover. Due to the measurement issues it is therefore difficult to quantifiably prove or disprove the solar hypothesis/mechanism. There is however a mechanism.
Hank Roberts says

18 Nov 2008 at 2:46 PM

Ray, don’t go too far with the “sound science” argument.
Remember the notion that
> “It might be…” is not science
> unless you have a mechanism.

That could come from the tobacco lawyers’ successful attempt to stifle the epidemiologists for 20 or 30 years. E.g.,
http://pediatrics.aappublications.org/cgi/content/abstract/117/5/1745

Correlation is not causation.
But correlations are interesting — and selective quoting of correlations reveals a great deal also.

http://www.skepticalscience.com/solar-activity-sunspots-global-warming.htm covers this well.
tamino says

18 Nov 2008 at 3:06 PM

snorbert zangox has not yet seen fit to reply to my query.

I used GISS data for global temperature, Mauna Loa measurements of CO2 concentration, and international sunspot numbers, all since March 1958 (the beginning of the Mauna Loa CO2 data). The correlation of CO2 with temperature is r = 0.7936 (r^2 = 0.6299), not anywhere near the “low to mid 20s.” The correlation of sunspot count with temperature is r = 0.000603 (r^2 = 0.0000003638), not exactly in the “high 80s.”
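For anyone wanting to repeat this kind of check, the correlation coefficient in question is the ordinary Pearson r. Here is a minimal sketch, assuming two equal-length series already aligned month by month; the function name is illustrative and no real GISS, Mauna Loa or sunspot numbers are included.

```python
import math
from typing import Sequence


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equal-length series."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two series of equal length, at least 2 points")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of products and squares of the deviations from each mean.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant series")
    return sxy / math.sqrt(sxx * syy)
```

Squaring the result gives r^2, the fraction of variance in one series that a straight-line fit to the other accounts for, which is why both numbers are quoted side by side above.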

Perhaps Mr. zangox has given us an illustration of the difference between a skeptic and a denialist.
MrPete says

18 Nov 2008 at 3:19 PM

Dhogaza, I know you’re an experienced computer professional. I hope you understand the difference between component failure and method failure better than illustrated by your response.

Let’s quickly set aside the sidetrack about “attacks”… Personally, I have no dog in any such fights about data QA. I just want to see the situation improve. I don’t know why you are being defensive about this. Gavin is not (at least not to me.) Perhaps something from my background and relationships can help improve things. My perspective is not about just GISS, but GISS as part of an entire delivery system. AFAIK, this posting and this blog is authored by people at NASA not NOAA, which is why I’ve been commenting about NASA-related work here. Interestingly, I googled the words noaa and blog — and the only NOAA-related item that appeared was a blog requiring an internal noaa account. Meanwhile, NASA has a whole domain for blogs.nasa.gov! Way to go NASA! :)

I also wouldn’t be so quick to assume that it is obvious to “anyone with even minimal experience in the area” that primary emphasis needs to be on early-management of defects. How much consulting do you do for organizations that deal with masses of data? I try hard not to make assumptions about what people know about arenas that are not their area of specialty. This thread alone ought to be enough to give pause to anyone who has ears to hear.

With respect to evidence about the quality of QA, or evidence that the kinds of checking I listed is NOT performed. First, I was simply providing a list of good things I’d do. I did not suggest that none of them are being done. Nor did I insinuate, let alone claim, “that, in essence, no data QA is done at all.” Please get off your high horse before it steps on you. Your anger and defensiveness simply make it more difficult to move forward. My goal, and my statements, are aimed at improving the situation.

Dhogaza, you ask for evidence of lack of Data QA (from an outsider with zero access to NOAA and NASA internal procedures.) And our Rabbett friend (#245) suggests that it’s all been thought of before, and provides links to published literature on the subject at both NASA and NOAA. On the surface, one might assume Mr Pete is just blowing smoke. Obviously, some real Data QA Professionals are minding the store, correct?

Why would I suspect that professional Data QA expertise might be helpful? Because both the visible real world evidence, and the literature itself (see below) lead a person who has such expertise to such an hypothesis. There are plenty of examples in the literature and in the “real world” demonstrating that all four of the basic analysis types I mentioned are problematic. And I haven’t even touched on more sophisticated statistical analyses of the data flow.

Below, I’ve taken a few minutes to go down this bunny trail with you.

First, let’s review what NOAA has to say about USHCN Rev 2, using the link graciously provided by Eli. It provides a simple summary of the data chain from their perspective. It also documents NOAA’s QA procedure:

The data chain: HCN Station Sensor Data -> QA -> Homogeneity Testing -> Various Adjustments and data infilling -> Output
Data collected: Daily Min Temp, Daily Max Temp, Daily Total Precip
QA Tests Used: “A series of quality evaluation checks”
QA Test Failure Result: Monthly data set to N/A if 30% of measurements are missing or fail the QA tests. (“no monthly temperature average or total precipitation value was calculated for station-months in which more than 9 were missing or flagged as erroneous”)
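A minimal sketch of the quoted failure rule, assuming daily values arrive as a list in which missing or flagged days are already marked as None. The function name and that input convention are hypothetical illustrations, not NOAA's code:

```python
from typing import Optional, Sequence

# From the quoted rule: more than 9 missing or flagged days means no monthly value.
MAX_BAD_DAYS = 9


def monthly_mean(daily: Sequence[Optional[float]]) -> Optional[float]:
    """Mean of the valid daily values, or None when too many days are bad.

    Assumption: None stands for a day that is missing or failed a QA check.
    """
    valid = [v for v in daily if v is not None]
    bad_days = len(daily) - len(valid)
    if bad_days > MAX_BAD_DAYS or not valid:
        return None  # the station-month is reported as N/A
    return sum(valid) / len(valid)
```

Note that the rule turns bad days into a gap and says nothing about why those days were bad.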

Now, let’s see what Peterson of NCDC has to say; the second link again graciously provided by Eli:

Incoming Data QA Tests Used: “A wide variety of checks have been developed to identify erroneous data points” (p 14)(***)
Post Production QA Tests:
* “Comparison with other data sets: they show the same thing” (p19)
* “Comparison of Land and Ocean: they show the same thing” (p20) [the graph shows general correlation of cool vs warm decades]
* “Comparison of Urban and Rural stations: they show the same thing” (p21) [graph shows correlation to 0.05 degrees, urban vs rural]
[I’ll stop there; the bulk of the document is on homogeneity adjustments, not Data QA]

(*** I do not have access to Peterson 1998 other than the abstract. The complete Peterson 1998 abstract is here.)

That, in brief, sums up the documents available. If someone has access to Peterson 1998, I’ll be glad to review it.

Eli suggests these documents should give me confidence that the data QA process is well in hand. I searched a bit more to find some full documents that say more than “a wide variety of checks” hand waving. And found a related manuscript here.

As far as I can tell, the typical Data QA testing and procedures can be summarized as follows: a) Identify outliers; b) remove or truncate them; c) perform further adjustments, including missing data processing.

Let’s compare with my very simple list:
* Outliers (way too high, too low)
* Zeros of all kinds (test the data chain: what comes out when zero is inserted at various points?)
* Missing data (again, test the data chain)
* Data propagation errors (check deltas between values in time and data-proximity… a few identical values can be natural, but not very many)
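The four checks in the list can be sketched roughly as follows. This is an illustration under stated assumptions, not anyone's production QA: the function names, the temperature bounds, and the use of None for missing values are all hypothetical, and the zero check here only flags exact zeros in the data rather than injecting zeros into the data chain.

```python
from typing import Dict, List, Optional, Sequence


def outliers(values: Sequence[Optional[float]],
             low: float = -90.0, high: float = 60.0) -> List[int]:
    """Indices of values outside [low, high]; the bounds are assumed, in deg C."""
    return [i for i, v in enumerate(values)
            if v is not None and not (low <= v <= high)]


def exact_zeros(values: Sequence[Optional[float]]) -> List[int]:
    """Indices of exact zeros, which may be real readings or a failed feed."""
    return [i for i, v in enumerate(values) if v == 0.0]


def missing(values: Sequence[Optional[float]]) -> List[int]:
    """Indices of missing values (None by assumption)."""
    return [i for i, v in enumerate(values) if v is None]


def repeated_fraction(prev: Dict[str, float], curr: Dict[str, float]) -> float:
    """Fraction of stations whose current value is identical to the previous one.

    A few identical values can be natural; a large fraction suggests that
    one month was copied over into the next.
    """
    shared = [s for s in curr if s in prev]
    if not shared:
        return 0.0
    same = sum(1 for s in shared if curr[s] == prev[s])
    return same / len(shared)
```

For scale, the post above reports 90 of 908 stations with October equal to September (about 10%), against a typical count of about 16 (under 2%), so a threshold on repeated_fraction somewhere between those would have caught this case. Where exactly to set it is a judgment call.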

For this comparison, remember that my stated emphasis is on detecting valid sensor data vs invalid data due to process/equipment/human error. To that end, for each item I will look for evidence that invalid data is: Detected, Accounted for, and Corrected at the source to avoid the issue in the future.

1) Outliers: NCDC detects and accounts for, but has no apparent process to correct at the source. NOAA detects, then accounts for by deletion. They too have no apparent process to correct at the source.

2) Zeros of all kinds: there is no evidence that this is detected, accounted for or corrected.

3) Missing data: these documents do not hint at NCDC work on missing raw-source data detection/accounting/correction processes. Missing data is handled extensively later on as an adjustment. NOAA conflates outliers with missing data at the source level, and accounts for them both identically. No process is documented to correct such issues at the source.

4) Data propagation errors: there is no indication that either NCDC or NOAA has processes designed to detect, account for or correct data propagation errors.

Bottom line: I see no evidence that ANY of these are handled in such a way that the system learns to eliminate early errors. If anything, the system appears designed to allow for increasing amounts of error of unknown origin, converting errors into gaps, and then attempting to adjust for those gaps, which can consist of almost 1/3 of a data stream. And the system allows entire data streams to disappear unnoticed.

I see nary a hint that systemic data errors ought to be detected and used to make the system *more* robust. This is a recipe for a data chain that self-destructs over time, rather than improves.

I hope that brief interlude helps you see that perhaps MrPete has sufficient experience to say, at a gut level, “I think this process could use some help”.

BTW, I could take a pot shot at the value of assuming Data QA is being done when comparing highly adjusted data sets (let alone mutually derived data, or data that is visually different yet “shows the same thing”, etc etc).

But I won’t. :-D

I think we all could use some fresh air.
Pat Neuman says

18 Nov 2008 at 4:44 PM

Re: #330 (Kevin),

There should not be “serious limitations of time and funding” with GISS data.
NOAA NWS is a large and heavily funded agency (over 5,000 employees in 140 offices).
Ray Ladbury says

18 Nov 2008 at 6:40 PM

William Astley, Cite please on GCR fluxes. I’ve seen no evidence with any of my satellites–and believe me, I’d know. There is a normal modulation out of phase with the solar cycle (max during solar min–min during solar max). That’s not unusual. Is this over and above that?
Hank Roberts says

18 Nov 2008 at 7:43 PM

Comparable attention to everything NASA does might’ve saved a number of Mars programs that failed due to minor software errors. Just sayin’.

Has anyone yet compared Gavin to Tom Sawyer, busily whitewashing that fence?

Worth recalling:

http://www.levity.com/alchemy/blake_ma.html
————

Opposition is true Friendship.

——————–
Good scrutiny (that is not served up heavily larded with attitude) is more likely to be heard and applied.
Being effective often requires giving up some egoboo.
Christopher Keys says

18 Nov 2008 at 7:58 PM

I’m glad that is cleared up.
Rod B says

18 Nov 2008 at 9:39 PM

Kevin, just for the record so as to not get broad brushed tainted: As a skeptic (though Mark says not ;-) ), I early in this thread put myself in the molehill camp.

BTW, your response to me on the paleoclimate thing was good.
Rod B says

18 Nov 2008 at 9:49 PM

PS – To be complete, however, being a molehill does not make it unimportant…
Kevin McKinney says

18 Nov 2008 at 10:13 PM

Re 338:

Pat, I am just summarizing what I have seen on the thread as I understand it–in this case, based on Gavin’s response to #37, way up at the top of the thread.
Barton Paul Levenson says

19 Nov 2008 at 5:58 AM

snorbert writes:

How do you know that cosmic ray flux changes have not been enough to change the temperature of the climate?

Which part of “the trend has been flat for 50 years” did you not understand?
Tenney Naumer says

19 Nov 2008 at 8:13 AM

If anything, this post and comments have clearly shown how little the denialist fringe know and understand of the scientific process for the analysis and publishing of any data and their continued stubborn refusal to learn about such processes.

This does come with a price.

But, ya know what?

I predict that their day will be over in about 12 months or less.

[operation ear]
Rod B says

19 Nov 2008 at 10:01 AM

(340) “Opposition is true Friendship.”

There ya go!
Marcus says

19 Nov 2008 at 12:01 PM

Speaking of surface temperature records, and all the moles burrowing into surface temperature data collection over at Watts Up: have there been any preliminary comparisons made between the appropriately aggregated national-scale temperature results from the Climate Reference Network and the USHCN network?
Pat Neuman says

19 Nov 2008 at 12:24 PM

Re: #330

Quality control software is used at NWS River Forecast Centers (RFCs) for calibration of river basins based on historical precipitation and temperature data at U.S. climate stations. The effort, which began in the mid 1970s, is now called the NWS “Advanced Hydrologic Prediction Service”. In the 1980s AHPS had another name (then called WARFS) but it took the 1993 Great Flood and a rename to AHPS to get Congress to continue the funding. A great deal of money went into developing the quality control software (double-mass plotting techniques, etc.). That technology should be used to evaluate historical and recent data at climate stations. Government agencies could do a lot better in evaluating historical and recent data if NWS resources were being tapped. Why haven’t those NWS resources been tapped? My answers include fear and failure by NOAA NWS in addressing climate change in NWS hydrology, weather and climate responsibilities.

AHPS:

http://www.weather.gov/ahps/about/about.php
tom says

19 Nov 2008 at 1:04 PM

“if you guys read the NASA reports and papers,here and on GOOGLE, you will see that global cooling, localized cooling trends, warming, and the other data collected that you think is denied by the modellers is all there in black and white; yet we still have a net global warming effect”

Say what?
How could we have global cooling and net warming?