Past reconstructions: problems, pitfalls and progress

7 Dec 2007 by Gavin

Many people hold the mistaken belief that reconstructions of past climate are the sole evidence for current and future climate change. They are not. However, they are very interesting and useful for all sorts of reasons: for modellers to test out theories of climate change, for geographers, archaeologists and historians to examine the impact of climate on past civilizations and ecosystems, and for everyone to get a sense of what climate is capable of doing, how fast it does it and why.

As a small part of that enterprise, the climate of the medieval period has received a very high (and sometimes disproportionate) profile in the public discourse – due in no small part to the mistaken notion that it is an important factor for the attribution of current climate change. Its existence as a period of generally warmer temperatures (at least in the Northern hemisphere) than the centuries that followed is generally accepted. But the timing, magnitude and spatial extent are much more uncertain. All previous multiproxy reconstructions indicate a Northern Hemisphere mean temperature less than current levels, though possibly on a par with the mid-20th century. But there are only a few tenths of a degree in it, and so the description that it is likely to have been warmer now (rather than virtually certain) is used to express the level of uncertainty.

A confounding factor in discussions of this period is the unfortunate tendency of some authors to label any warm peak prior to the 15th Century as the ‘Medieval Warm Period’ in their record. This leads to vastly different periods being similarly labelled, often giving a misleading impression of coherence. For instance, in a recent paper it was defined as 1200-1425 CE, well outside the ‘standard’ definition of 800-1200 CE espoused by Lamb.

Since a new ‘reconstruction’ of the last 2000 years from Craig Loehle is currently doing the rounds, we thought it might be timely to set out what the actual issues are in making such reconstructions (as opposed to the ones that are more often discussed), and how progress is being made despite the pitfalls.

The Loehle paper was published in Energy and Environment – a journal notable only for its rather dubious track record of publishing contrarian musings. The reconstruction itself is based on a network of 18 records that are purportedly local temperature proxies, and we will use those as examples in the points below. More discussion of this paper is available here (via the wayback machine).

Issue 1: Dating

Nothing is more important than chronology in paleo-climate. If you can’t line up different records accurately, you simply can’t say anything about cause and effect or coherence or spatial patterns. Records where years can be accurately counted are therefore at a premium in most reconstructions. These encompass tree ring width, density and isotopes, some ice cores, corals, and varved lake sediments. The next most useful set of data are sources that have up to decadal resolution but that can still be dated relatively accurately. High-resolution ocean sediment cores can sometimes be found that fit this, as can some cave (speleothem) records and pollen records etc.

There are nonetheless more problems with the decadal data – they may have been smoothed by non-climate processes, and their dating may be off by a decade or two. But there are enough records that are widely enough dispersed to make them a useful adjunct in a reconstruction that hopes to capture decadal to multi-decadal variability.

Using data that has significantly worse resolution than that in reconstructions of recent centuries is asking for trouble. The age models tend to have errors in the hundreds of years, and the density of points rarely allows one to reach the modern instrumental period.

For instance, South-Eastern Atlantic ocean sediment data from Farmer et al (2005) (Loehle data series #17) nominally goes up to the present (0 calendar years). This is really 1950 due to the convention that years ‘Before Present’ (BP) almost invariably begin then (some recent papers use BP(2000) to indicate a different convention, but that is always specifically pointed out). However, the earliest real date for that core is 1053 BP, with a 2-sigma range of 1303 to 946 BP – almost 400 years! That makes this data completely unsuitable for reconstructions of the last 2000 years – which in all fairness, was certainly not the focus of the original paper.

Similar issues arise with data from DeMenocal et al (2000) (Loehle #10) and SSDP-102 (Kim et al, 2004) (Loehle #18). In the first record, the initial data point nominally comes from 88 BP (i.e. 1862 CE), but the earliest dated sample is around 500 BP. In the second, the initial date is closer to the present (1940), but the age model is constrained by only 3 ages over the whole Holocene (and it’s not clear that any are within the last two millennia). So while both records have more apparent resolution than Farmer et al, their use in a reconstruction of recent paleo-climate is dubious.

It should probably be pointed out that the Loehle reconstruction has mistakenly shifted all three of these records forward by 50 years (due to erroneously assuming a 2000 start date for the ‘BP’ time scale). Additionally, the series used by Loehle for the Farmer et al data is not the SST reconstruction at all, but the raw Mg/Ca measurements! Loehle #12 (Calvo et al, 2002) is also off by 50 years, but since it doesn’t start until 1440 CE, its presence in this collection is surprising in any case. The dates on two other ocean sediment cores (Stott et al 2004 – #14 and #15) are on the correct scale thankfully, but are still marginal in terms of resolution (29 and 44 years respectively, but effectively longer still due to bioturbation of the sediments). Neither of them, however, extends beyond the mid-20th Century (end points of 1936 CE and 1810 CE) and so they aren’t much use for looking at medieval-vs-modern data.
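
To make the dating convention concrete, here is a minimal sketch of the BP-to-CE conversion, using only the ages quoted above. The function and variable names are illustrative and are not taken from any of the papers discussed.

```python
# Convention described above: "present" in "years BP" is 1950 CE.
BP_REFERENCE_YEAR = 1950


def bp_to_ce(age_bp, reference_year=BP_REFERENCE_YEAR):
    """Convert an age in years Before Present to a calendar year CE."""
    return reference_year - age_bp


# Ages quoted in the text above.
first_point_loehle_10 = bp_to_ce(88)             # nominal first data point
dated_sample_loehle_17 = bp_to_ce(1053)          # dated sample in the core
two_sigma_loehle_17 = (bp_to_ce(1303), bp_to_ce(946))

# Wrongly assuming a 2000 CE reference moves every date forward by 50 years.
shift = bp_to_ce(88, reference_year=2000) - bp_to_ce(88)

print(first_point_loehle_10, dated_sample_loehle_17, two_sigma_loehle_17, shift)
# prints: 1862 897 (647, 1004) 50
```

The 2-sigma range on that single dated sample spans 357 calendar years, which is the "almost 400 years" referred to above.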

Other dating issues arise if the age model was tuned for some purpose. For longer time scale records, the dates are often tuned to ‘orbital forcing’ periods based on the understanding that precession and obliquity do have strong imprints in many records. However, in doing so, you remove the ability to assess with that record whether the orbital expression is leading or lagging another record. Since reconstructions of recent centuries are often pored over for signs of solar or volcanic forcing, it is crucial not to use those signals to adjust the age model. Unfortunately, the Mangini et al (2005) speleothem record (Loehle #9) was tuned to a reconstruction of solar activity so that the warm periods lined up with solar peaks. This invalidates its use on that age model for any useful reconstruction, since it would be assuming a relationship one would like to demonstrate. If put on a less biased age model, it could be useful however (but see issue 3 as well).

Issue 2: Fidelity

This issue revolves around what the proxy records are really recording and whether it is constant in time. This is of course a ubiquitous problem with proxies, since it is well known that no ‘perfect’ proxy exists, i.e. there is no real world process that is known to lead to proxy records that are only controlled by temperature and no other effect. This leads to the problem that it is unclear whether the variability due to temperature has been constant through time, or whether the confounding factors (that may be climatic or not) have changed in importance. In the case where the other factors seem to be climatic (d18O in ice cores for instance), the data can sometimes be related to some other large scale pattern – such as ENSO – and could thus be an indirect measure of temperature change.

In many cases, proxies such as Mg/Ca ratios in foraminifera have laboratory and in situ calibrations that demonstrate a fidelity to temperature. However, some proxies, like d18O which do have a temperature component, also have other factors that affect them. In forams, the other factors involve changes in water mass d18O (correlated to salinity), or changes in seasonality. In terrestrial d18O records, the precipitation patterns, timing and sources are important – more so in the tropics than at high latitudes though.

A more prosaic, but still important, issue is the nature of what is being recorded. Low resolution data is often not a snapshot in time, but part of a continuous measurement. Therefore the 100 year spaced pollen reconstruction data from Viau et al (2006) (Loehle #13) are not estimates for the mid-point of each century, but are century averages. Linear interpolation between these points will give a series that actually has different century-long means. The simplest approach is to use a continuous step function with each century given the mean, or a spline fit that preserves the average rather than the mid-point value. It’s not clear whether the low resolution series in Loehle (#4, #5, #6, #10, #13, #14, #15, #17, #18) were treated correctly (though to be fair, other reconstructions have made similar errors). It remains unclear how important this is.
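
The difference between the two treatments can be illustrated with a small sketch. The three century means below are made-up numbers chosen only to show the effect, and the function names are hypothetical; this is not the procedure used in any of the reconstructions discussed.

```python
def step_series(century_means, years_per_bin=100):
    """Annual series in which every year of a bin carries that bin's mean."""
    return [m for m in century_means for _ in range(years_per_bin)]


def midpoint_interp_series(century_means, years_per_bin=100):
    """Annual series from linear interpolation between bin mid-points.

    Treats each mean as if it were a snapshot at the centre of its bin.
    Values before the first and after the last mid-point are held constant.
    """
    n = len(century_means)
    mids = [(i + 0.5) * years_per_bin for i in range(n)]
    series = []
    for year in range(n * years_per_bin):
        t = year + 0.5  # centre of this year
        if t <= mids[0]:
            series.append(century_means[0])
        elif t >= mids[-1]:
            series.append(century_means[-1])
        else:
            i = int((t - mids[0]) // years_per_bin)
            frac = (t - mids[i]) / years_per_bin
            step = century_means[i + 1] - century_means[i]
            series.append(century_means[i] + frac * step)
    return series


def bin_means(series, years_per_bin=100):
    """Average an annual series back into consecutive bins."""
    return [sum(series[i:i + years_per_bin]) / years_per_bin
            for i in range(0, len(series), years_per_bin)]


# Hypothetical century averages: a single warm century between two cooler ones.
means = [0.0, 1.0, 0.0]
print(bin_means(step_series(means)))             # recovers the input means
print(bin_means(midpoint_interp_series(means)))  # warm century drops to about 0.75
```

In this toy case the interpolated series damps the warm century by roughly a quarter and leaks the remainder into its neighbours, while the step function returns the original averages by construction.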

Issue 3: Calibration

Correlation does not equal causation. And so a proxy with a short period calibration to temperature with no validating data cannot be fully trusted to be a temperature proxy. This arises with the Holmgren et al (1999) speleothem grey-scale data (Loehle #11) which is calibrated over a 17 year period to local temperature, but without any ‘out-of-sample’ validation. The problem in that case is exacerbated by the novelty of the proxy. (As an aside, the version used by Loehle is on an out-of-date age model (see here for an up-to-date version of the source grey-scale data – convert to temperature using T=8.66948648-G*0.0378378) and is already smoothed with a backwards running mean implying that the record should be shifted back ~20 years).
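
For anyone wanting to apply the conversion quoted in the aside, a minimal sketch follows. The coefficients are the ones given above; the function name and the sample grey-scale values are made up for illustration, and no units are asserted beyond what the source calibration defines.

```python
# Coefficients as quoted above: T = 8.66948648 - G * 0.0378378
INTERCEPT = 8.66948648
SLOPE = -0.0378378


def grey_scale_to_temperature(grey):
    """Apply the quoted linear calibration to one grey-scale value."""
    return INTERCEPT + SLOPE * grey


# Hypothetical grey-scale values, only to show the sign of the relationship.
for g in (0.0, 50.0, 100.0):
    print(g, round(grey_scale_to_temperature(g), 4))
```

The slope is negative, so higher grey-scale values map to lower temperatures under this calibration.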

As mentioned above, there are a priori reasons to assume d18O records in terrestrial records have a temperature component. In mid-latitudes, the relationship is positive – higher d18O in precipitation in warmer conditions. This is a function of the increase in fractionation as water vapour is continually removed from the air. Most d18O records – in cave stalagmites, lake sediments or ice cores – are usually interpreted this way since most of their signal is from the rain water d18O. However, only one terrestrial d18O record is used by Loehle (#9 Spannagel), and this has been given a unique negative correlation to temperature. This might be justified if the control on d18O in the calcite was from local cave temperature impact on fractionation, but the slope used (derived from a 5-point calibration) is more negative even than that. Unfortunately, no validation of this temperature record has been given.

Issue 4: Compositing

Given a series of records with different averaging periods, spatial representation and noise levels, there are a number of problems in constructing a composite. Equal averaging is simple but, for instance, implies giving equal weight to a century-mean North American continental average (Viau et al, Loehle #13) and to a single decadally varying N. American point (Cronin et al, #3), despite the fact that one covers a vast area and time period and the other is much less representative. Unsurprisingly, the larger average has much less variability than the single point. To address this disparity, a common practice is to normalise the records by their standard deviation and to areally weight records – but without that, the more representative sample ends up playing a much smaller role.
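
As a rough sketch of the normalisation and weighting step described above, consider the following. The two series and the weights are invented for illustration; in practice the choice of weights and of the period used for the standard deviation both need care.

```python
from statistics import mean, pstdev


def standardise(series):
    """Remove the mean and scale to unit standard deviation."""
    mu = mean(series)
    sd = pstdev(series)
    return [(x - mu) / sd for x in series]


def weighted_composite(records, weights):
    """Weighted average of equal-length records at each time step."""
    total = sum(weights)
    n = len(records[0])
    return [sum(w * rec[t] for rec, w in zip(records, weights)) / total
            for t in range(n)]


# Hypothetical data: a smooth large-area average and a noisy single site.
continental = [0.0, 0.1, 0.2, 0.1, 0.0, -0.1]
single_site = [1.0, -2.0, 1.5, -1.0, 2.0, -1.5]

# Plain averaging: the single site dominates because of its larger variance.
naive = weighted_composite([continental, single_site], [1, 1])

# Standardise first, then weight by an assumed relative area (3:1 here).
weighted = weighted_composite(
    [standardise(continental), standardise(single_site)], [3, 1])

print([round(x, 3) for x in naive])
print([round(x, 3) for x in weighted])
```

In the naive composite the sign of each value simply follows the single site, whereas after standardising and weighting the large-area record sets the overall shape.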

Another approach, used implicitly in climate field reconstruction methods (like RegEM for instance), is to use current instrumental records to assess the relevance of any particular point, region or time period to the desired target. Another idea would be to estimate the changes in noise characteristics over larger areas and longer times and build that into the normalisation. That might also be useful for records whose resolution decays in time (the GRIP borehole temperature for instance, Loehle #1).

Finally, one needs to be very careful to deal with each series consistently. Treating an interpolated low-resolution record differently to another low-resolution record that wasn’t interpolated seems inconsistent. Keigwin’s Sargasso sea record is very low-resolution (Loehle #4) but Loehle appears to use it as though it was a real high-resolution record, while Kim et al (Loehle #18), which is equally low-res, is only used within 15 years of a datapoint.

Issue 5: Validation

It is inevitable that many seemingly ad-hoc decisions need to be made in building a particular reconstruction. This is not in itself cause for concern – the inhomogeneity of the data and its sparsity require that kind of consideration. Given that there is then no mathematically perfect way of doing this, the test of whether any particular approach is worthwhile lies in the validation, i.e. does the reconstruction give a reasonable fit to the target field or index over a period or with data that wasn’t used in the calibration? There’s a good discussion of these issues in two recent papers in Climatic Change (Wahl and Ammann, 2007; Ammann and Wahl, 2007) in relation to the original Mann, Bradley & Hughes papers. One would also like to test how sensitive the answers are to other equally sensible choices – a result can be considered robust if it is relatively insensitive to such methodological choices.
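
One way to picture out-of-sample validation is a split into calibration and verification periods, scored with a skill statistic. The sketch below uses the reduction of error (RE) score, which is one commonly used choice and is offered here only as an example; the data, the split and the function names are all made up.

```python
from statistics import mean


def fit_line(x, y):
    """Ordinary least-squares slope and intercept for y = slope * x + intercept."""
    mx, my = mean(x), mean(y)
    slope = (sum((a - mx) * (b - my) for a, b in zip(x, y))
             / sum((a - mx) ** 2 for a in x))
    return slope, my - slope * mx


def reduction_of_error(observed, reconstructed, calibration_mean):
    """RE = 1 - SSE / SS about the calibration-period mean.

    RE of 1 is a perfect fit; RE of 0 means no better than simply
    using the calibration-period mean as the reconstruction.
    """
    sse = sum((o - r) ** 2 for o, r in zip(observed, reconstructed))
    ref = sum((o - calibration_mean) ** 2 for o in observed)
    return 1.0 - sse / ref


# Hypothetical proxy and instrumental temperature over eight time steps.
proxy = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]
temp = [0.05, 0.08, 0.22, 0.29, 0.41, 0.52, 0.58, 0.71]

# Calibrate on the later half, verify on the earlier half (withheld data).
slope, intercept = fit_line(proxy[4:], temp[4:])
reconstructed = [slope * p + intercept for p in proxy[:4]]
re_score = reduction_of_error(temp[:4], reconstructed, mean(temp[4:]))
print(round(re_score, 3))
```

Repeating the exercise with the calibration and verification periods swapped, or with individual records left out, is a simple way to probe how sensitive the result is to those choices.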

What does this imply for Loehle’s reconstruction? Unfortunately, the number of unsuitable series, errors in dating and transcription, combined with a mis-interpretation of what was being averaged, and a lack of validation, do not leave very much to discuss. Of the 18 original records, only 5 are potentially useful for comparing late 20th Century temperatures to medieval times, and they don’t have enough coverage to say anything significant about global trends. It’s not clear to me what impact fixing the various problems would have or what that would imply for the error bars, but as it stands, this reconstruction unfortunately does not add anything to the discussion.

So where does this all leave us? Since the early days of multi-proxy reconstructions a decade ago the amount of suitable data has definitely increased, and so many of the issues related to specific proxies are becoming increasingly unimportant. As the amount of data grows, the picture of climate in medieval times will likely become clearer. What seems even more likely is that the structure of the climate anomalies will start to emerge. The simple question of whether the medieval period was warm or cold is not particularly interesting – given the uncertainty in the forcings (solar and volcanic) and climate sensitivity, any conceivable temperature anomaly (which remember is being measured in tenths of a degree) is unlikely to constrain anything.

However, if the tantalising link between medieval American mega-droughts and potential long-term La Nina conditions in the Pacific can be better characterised, that could be very useful at constraining ENSO sensitivity to climate change – something of great interest to many people. That will have to wait for a better next-generation reconstruction though.

Thanks to Eric Swanson for helping find some of the more interesting choices made by Loehle in his reconstruction, and Karin Holmgren for swift responses to my queries about her data.

Update (Jan 22): Loehle has issued a correction that fixes the more obvious dating and data treatment issues, but does not change the inappropriate data selection, or the calibration and validation issues.

75 Responses to "Past reconstructions: problems, pitfalls and progress"

Susan K says

9 Dec 2007 at 7:45 PM

Would a climate scientist here please provide a link for me showing when these climate change induced predictions were first published:

Melting arctic
Rising sealevels
More Droughts
(ie desertificated areas doubled in size from 1970s to 2003 – had anyone predicted increasing drought before the 1970s?)
http://enviro.org.au/drought.asp
More Floods
More intense Hurricanes (and surely wouldn’t that include Cyclones/Typhoons outside the US?)
New insect pests destroying forests as habitat moves North

There seems a very entrenched bunch of deniers over at the Wall Street Journal and although I try to put them right when possible, I would like some of the above facts for ammunition:
http://blogs.wsj.com/energy/2007/12/07/beat-the-beetles/
Armagh Geddon says

10 Dec 2007 at 5:15 AM

“Finally, I agree, real analyses will win out in the end. I’m hopeful that is too far off. – gavin]”

Never was a truer word said!!!

[Response: fixed. thanks -gavin]
Chuck Booth says

10 Dec 2007 at 10:01 AM

Re # 50 Susan –

Try the AIP The Discovery of Global Warming site – link in the orange box on the right side of the RC home page, or:

http://www.aip.org/history/climate/index.html
Timothy Chase says

10 Dec 2007 at 12:14 PM

Jim Galasyn (#48) wrote:

Villarreal,
Viruses and the Evolution of Life [book]

I have known about him for years, but unfortunately haven’t been able to afford the book.

Jim Galasyn (#48) wrote:

I found it to be astonishing – it looks to me like Villarreal has cracked the nut of evolution.

Here’s his site:

Center for Viral Research
http://cvr.bio.uci.edu/

He seems to have been one of the pioneers. Obviously there has been a great deal going on. For example, it would appear that our spliceosomal introns and retroviruses are both descendants of early self-splicing Type II introns found in archaea and prokaryotes, where Type II introns are themselves regarded as retroelements since they are mobile. Likewise, the catalytic core of telomerase with which the telomeres at the ends of the chromosomes becomes extended is a reverse transcriptase which appears related to the reverse transcriptase found in retroviruses. Likewise, the adaptive immune system has relics of retroelements (a LINE and SINE) which make possible lymphocyte rearrangements. Viruses appear to be taking center stage along with cells in the evolution of life.

However, one of the biggest ways in which viruses (primarily retroviruses) have contributed to the evolution of life would appear to be in the generation and dissemination of tandem repeats — which are subject to hypermutation and appear to have been subject to “indirect selection” for those regions where a higher rate of mutation might prove beneficial in dealing with a changing environment. Protein coding regions, introns and promoters. Point mutations may serve to regulate the mutability of such tandem repeats – by breaking up the tandem repeats, rendering them subject to reduced levels of hypermutation. Thus, for example, many proteins have cryptic repeats where the codons will code for the same sets of amino acids as an ordinary tandem repeat, but there will be substitutions in the individual letters of the triplet.

Anyway, feel free to email me at the address I gave above — I can put together a list of papers on a few topics in this area (viruses, repeats, phages, etc.) along with links to some of the technical articles, where possible — if you are interested. Might take me a day or so, though.

But it should be handled outside of Real Climate – as it is off-topic, as Charles Booth has indicated.
Timothy Chase says

10 Dec 2007 at 12:47 PM

Re Jim Galasyn (#48) on Villarreal

I’ve known about him for years, but I haven’t been able to purchase the book as of yet, but he seems to have been one of the pioneers. Obviously there has been a great deal going on.

However, I have been able to find a fair amount of the literature: viruses, phages, hypermutable tandem repeats (in protein coding regions, introns and promoters) which largely appear to be relics of endogenous retroviruses, indirect selection for mutability, cryptic repeats in protein coding regions, somatic rearrangements in the adaptive immune system, telomerase, etc.. Viruses appear to be taking center stage along with cells in the evolution of life.

Charles Booth is right about the fact that this is off-topic, but feel free to email me at the address I gave above — I can put together a list of papers on a few topics in this area along with links to some of the technical articles which are open access, where possible — if you are interested. Might take me a day or so, though.
Gaelan Clark says

10 Dec 2007 at 4:35 PM

Gavin, I would like to know a little more about your Loehle #12 reference, in which you state that it “doesn’t start until 1440 CE.” Furthering that thought you note that it is a surprise that Calvo 2002 is even used. In fact a quick look at the link that you have aptly provided suggests that the Calvo 2002 actually ends in 1440 CE.—–From Calvo 2002—Minimum Age: 0.510 kyr BP * Maximum Age: 8.490 kyr BP—
This minimum age of 0.510 kyr (or 510 years) subtracted from 1950 CE yields 1440 CE.
Have I missed something?

[Response: No. Paleo conventions in time often go backwards. Therefore when using ‘start’, I implied the first data point from the present (think of the first data point in depth in the core). I’m not sure that semantic arguments are really very constructive though – the main point is that this record adds nothing to estimates of medieval to current (or even LIA) temperatures. – gavin]
Martin Vermeer says

10 Dec 2007 at 6:29 PM

#49 Chuck:

Had Martin specified wild animals with fur coats, your questioning of my canine example would be fully justified

That you didn’t get that from the context means that you missed everything about my original post. So once again, con amore: when humans learned the arts of dressing in clothes and making fire, the selective advantage of having a natural warm furry coat turned into a disadvantage — growing fur costs energy. So the reverse adaptation took place. Simple, direct, obvious, no subtle mechanisms involved or needed in this context. OK?
Chuck Booth says

10 Dec 2007 at 8:28 PM

Re #50 Martin Vermeer:
I attempted to respond privately, but couldn’t locate your email address. I hope the RC moderators will endure one more post on this topic:
I interpreted “those animals having them” (# 43) to not be limited to humans. I don’t question your explanation of why humans are relatively hairless. But, I’m not up on the research on that subject, and one needs to be careful about evolutionary story telling (a major point of Gould and Lewontin). Hair is one trait that has numerous functions: It can keep animals warm in cold climates (by reducing convection across the skin), it can keep them cool in hot climates (by reflecting visible light, or absorbing visible light and IR but not conducting the heat to the animal’s skin), reduce water loss (again, by reducing convection), and protect the skin from invasion by parasites, bacteria, and viruses.
In explaining why humans have lost most of their hair, you really need to consider all likely functions (and selection pressures). That was my point about domestic dogs: As Timothy noted, their coats are, in many cases, the result of selective breeding for esthetics rather than thermoregulation. Artificial selection in this case is analogous to sexual selection (instead of attracting mates, the dogs attract breeders bearing mates). In nature, sexual selection can result in seemingly maladaptive traits (as Darwin noted). And selection pressures for multiple traits (heat retention, heat resistance, protection against invaders) may result in evolutionary compromises – the end result may not be optimal for any one trait, but instead may be the optimal solution for several traits together.
Timothy Chase says

10 Dec 2007 at 8:57 PM

Re Jim Galasyn (#48)

As Chuck Booth says, the biology is off topic. But feel free to email me and I can give you a list of related topics, check off the ones you are interested in, and I can give you a list of papers in the primary literature, links to most — and if you are interested, pdfs where the links aren’t available. As I said, an obsession, roughly for three years.

PS Don’t have the book you mentioned. Used it is still well over a hundred. Sorry I didn’t respond sooner. Actually I did respond, but things seem to have been a little glitchy past few days.
Susan K says

10 Dec 2007 at 9:52 PM

thanks, chuck
Martin Vermeer says

11 Dec 2007 at 5:08 AM

Re #55 Chuck Booth: OK, point taken.

In nature, sexual selection can result in seemingly maladaptive traits (as Darwin noted)

Yes, it often does. And not only sexual selection. It is the perfect illustration of evolution often seeking out local rather than global optima — not a very “intelligent” process :-)

Off-topic or not, I like noticing the very broad scientific appetite of this readership.
Barton Paul Levenson says

11 Dec 2007 at 7:50 AM

[[when humans learned the arts of dressing in clothes and making fire, the selective advantage of having a natural warm furry coat turned into a disadvantage — growing fur costs energy. So the reverse adaptation took place. ]]

Wrong. Humans already lacked fur, which is why they put on the clothing. You’ve got the causality backwards.
Martin Vermeer says

11 Dec 2007 at 8:35 AM

#48:

http://cvr.bio.uci.edu/

Thanks Jim. This is fantastic reading.
Marion Delgado says

12 Dec 2007 at 6:31 AM

Vern Johnson:

The main objection, and it’s been pointed out here repeatedly, seems to be that those “raising some doubts” are simply doing that – raising doubts. Not trying to improve the actual knowledge of anything, just raising doubts like a trial attorney. When confronted with that, the doubt-raisers switch to another line of attack. They also wait a few years to recycle thoroughly debunked lines of attack. The process is simply a drive to create a fake controversy and tie down the time of scientists and science advocates.

On the other hand, you’re obviously, to me, confusing for some reason disagreement with the conclusions of those analyses which are scientific in intent and practice, with objection to them. Scientists don’t see it that way.

Until, and unless, you correct your false prior assumption that the disagreements here with flawed analysis, faulty research, or simply unlikely interpretations of data are objections to someone for their temerity in questioning some cabalistically approved research and analysis, which is garbage in, you’ll get garbage out – not understanding. The use of language by RC people is completely irrelevant. In general, when something is both completely out of the mainstream (and hence, in contradiction to the work of the majority of careful, hard-working scientists) AND has serious flaws, most people in science will at least conjecture that the two are related. Caring more about the lost sheep than the ninety and nine might be biblical, but it’s not how science tends to work.
Adrian Midgley says

12 Dec 2007 at 6:31 PM

Minimal fur let our ancestors cool off or not overheat when they ran to catch food. Clothing with a serious survival value I think got adopted as later ancestors moved away from the Rift Valley and then out of Africa. The hair on top could show a selective advantage in various ways.
Alexander Harvey says

13 Dec 2007 at 12:55 PM

There is something truly sad in all this.

I am sure that the individual researchers and their teams that gather and elucidate the raw data are most often doing their level best to try and see some light through a darkened lens.

I am sure that often they must know their individual speciality inside out. In particular where all the bodies (uncertainties and artifacts) are buried. In particular the strengths and weaknesses of their research. That is the nature of research, more understanding leads to more questions.

Given access to their results (often as opposed to their hard earned data) anyone with a PC and some spare time can combine their efforts, as if they were ingredients, to make a meal, i.e. make a meal of them.

“Gisser job”, “I can do that”.

There is a pride in doing something well that seems sadly out of fashion.

I take it on faith that the chronological inaccuracies (interpretation of BP etc.) as described by Gavin have occurred. Given that, it is truly remarkable that circumspection failed to wag its tail, jump up, and lick the author in the face, to warn that something is wrong. Given that the watch dog never stirred I have to wonder how come. By this I mean that when a serious error is not salient, what other errors must there be to cover it up?

What a shame!

Ho Hum!

Alexander Harvey
Alexander Harvey says

13 Dec 2007 at 2:27 PM

On Peer Review and Openness.

These are a qualification and a quality that many do and would welcome. They are also deeply intertwined.

The paper under discussion is a case in question.

If Gavin had been requested to give his opinion prior to publication one would presume that it would have made a material difference. Also, given that he has acknowledged his thanks to others who have contributed materially to his critique, there was not sufficient background available with the paper to adequately review it.

Personally I am all for openness. There is no reason why, in this day and age, that a paper should not be backed up by all the decision making process and most importantly all the data both initial and throughout each stage of its manipulation.

Or is there? In the best of all worlds that might be the case but in reality it is a can of worms. Unless it was mandated that nothing could be published without a forensic chain of evidence then we will have the noblest and the best facing detailed scrutiny, and the elusive and illusory covering their tracks.

Also publish and be damned does not bother those of a speculative disposition. With little or no relevant reputation to protect they can piss and run with little fear.

Occasionally a paper does emerge that is backed up by copious background data sufficient for one to check it for rigour. I took the time to check one of these. As it happens it has its problems or at least significant points that require clarification. Now, what should one do about this?

Personally I feel that to undermine such an open source for being “open” to extended criticism from all and sundry would be churlish and counter-productive. Better to sit on one’s hands.

Perhaps a little myopic, but should one be shooting a goose in clear view when there is more deserving prey in the bushes?

To be frank my stance bothers me and I wish there would be a way for independent reviewers to pool their concerns so that a considered opinion could be related to the authors.

I am sorry to say that blogs like this do not do the job. They are simply too confrontational. By this I mean, if your intention is to improve the quality and the standing of science simultaneously then rubbishing well considered but flawed work in public achieves neither.

Alexander Harvey
Alexander Harvey says

13 Dec 2007 at 2:49 PM

Gavin et al,

I should like to know about peer review in this field.

I have discussed this with workers in other disciplines and it comes down to the old anti-maxim “If it is worth doing, it is worth doing well”. By which I mean, if it is important and in your competence you agree and take great pains (it is after all your reputation on the line) or you pass. By great pains I mean perhaps a week, perhaps more.

Gavin, you have obviously (and thankfully) expended a good deal of effort on this paper and so I presume would have done as much or more if it had been referred to you pre-publication.

Given that, could you please give me/us a critique of what “peer review” means (in detail) at best and at worst.

I would be most grateful if you could reply.

Alexander Harvey

[Response: Good papers are easy to review. Bad papers can be extremely time-consuming. Mediocre papers are more variable – it depends how invested you are in the subject. Therefore people tend to accept assignments for good, and not for bad. There is only so much time in the day and so trying to find competent reviewers is difficult. Personally, I do maybe 2 a month and refuse an equal number (sometimes more). Good peer review can elevate a mediocre paper substantially (but that takes work). Finding the key flaw in a bad paper and documenting its effect is difficult – reviewers often don’t bother. But that can lead to problems too since the other reviewers or editor might not see it. What more do you want to know? -gavin]
Alexander Harvey says

13 Dec 2007 at 5:07 PM

Dear Gavin,

First thanks for your reply.

By good papers, I presume you mean clearly described ones? Good papers, as in striking ones, when they occur, must also require real effort.

I am surprised (I may be being naive) that you do two a month. If that is common and there are three reviewers per paper, that would seem to imply that the publishing rate is one every six weeks for each “peer”. Surely this is the fast end of scientific output.
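The arithmetic behind that estimate can be sketched as follows. This is only a rough illustration under assumptions not stated in the thread: every "peer" reviews at the same rate, and the number of reviews done equals the number of reviews needed. The function name and the 52/12 weeks-per-month figure are hypothetical choices made for the example.

```python
from fractions import Fraction

# Assumption: an average month is 52/12 weeks long.
WEEKS_PER_MONTH = Fraction(52, 12)


def weeks_between_papers(reviews_per_month, reviewers_per_paper):
    """Weeks between papers per peer if reviewing supply equals demand.

    Each paper consumes `reviewers_per_paper` reviews, so a peer who
    supplies `reviews_per_month` reviews can, on average, have
    reviews_per_month / reviewers_per_paper papers reviewed per month.
    """
    papers_per_month = Fraction(reviews_per_month) / reviewers_per_paper
    return WEEKS_PER_MONTH / papers_per_month


# Two reviews a month, three reviewers per paper.
print(float(weeks_between_papers(2, 3)))  # 6.5
```

With two reviews a month and three reviewers per paper this gives about six and a half weeks per paper, in line with the "one every six weeks" figure; the result shifts if reviewers are shared unevenly or if many papers have multiple authors.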

Personally I think this is extreme.

I hope you realise that in your sentence, “Finding the key flaw in a bad paper and documenting its effect is difficult – reviewers often don’t bother.”, you are begging a question that gets to the heart of the problem. Given that circumstance, what happens? Either you respond that it is too flawed to be deserving of comment, or you sigh and let it proceed.

I am not being critical of you or anyone in particular but papers that are of dubious scientific content do get published and it would seem to be a matter of great concern as to how to curtail this.

Now, I may not be able to halt climatic change or hold back the tide, but if I can do anything at all to improve the content of scientific papers I will.

I shall write more when I can but for now I would welcome any dialogue on this issue which I think goes to the root of all poorly reviewed yet published work.

Alexander Harvey
George Darroch says

13 Dec 2007 at 6:06 PM

Hi, just out of interest, is there any work that compares proxies to the instrumental temperature record for the last century? All the work I could find in an admittedly non-exhaustive search was understandably focussed on reconstructions, rather than what we already know. I’ve heard skeptics banging on about how badly proxies reflect actual climate, and would like to know how well they’ve proved their worth against instrumental data.
Ray Ladbury says

13 Dec 2007 at 6:09 PM

Alexander Harvey, while it is true you don’t want too many bad papers to get through, think for a moment how the average reader will react to a bad paper–namely by saying, “This is crap,” and moving on. We are talking about experts in the field here, not amateurs. Yes, the occasional grad student might get very excited about a bad paper, but he will be corrected quickly by his adviser if his adviser is worth anything. The ultimate test of a paper is how much it is cited in future work. Peer review is just a threshold, and its purpose is more to make good papers better than to reject bad papers.
Chuck Booth says

13 Dec 2007 at 6:46 PM

Re # 66 Alexander Harvey

As the RC moderators and others (such as Ray Ladbury) have pointed out repeatedly, peer review is merely a ticket that allows entrance into the marketplace of scientific ideas. Once in the marketplace, it is survival of the fittest – the best papers will be cited the most often and will gain the most credibility. But, again, as has been noted, a paper might have, simultaneously, a great idea or two (or valuable data) and a serious flaw – as a result, that paper might get cited for the former, but not for the latter.
Alexander Harvey says

14 Dec 2007 at 9:49 AM

Ray & Chuck,

I am trying to get to the bottom of the realities of the review process. Given that many papers are in essence desk-top activities that are easy to produce and require little depth of knowledge of the subtleties of the original work (as evidenced in the paper being tested here), the sheer effort of getting them adequately reviewed seems intolerable, and the amount of time available must surely be very limited.

Unfortunately this type of paper is obscure in that, short of retrieving the original data and reworking it, I doubt that more than an impression can be formed in the time available.

Also, a paper of this nature, as opposed to original research, is rarely duplicated, so that layer of verification is missing. Such papers would become simply noise were it not for their value to contrarians. Now I think this is important.

Personally, I wish that the “hockey stick” had not been given the prominence it received, as it has become a bit of an Aunt Sally. But it does exist, and as long as it does, papers like this one will receive prominence well beyond their merit.

People do, whether rightly or wrongly, put great store on peer reviewed work.

I presume this work was reviewed and if so it does seem to make a bit of a mockery of the process.

It would seem that if the time and effort that was put in by Gavin had been done by reviewers pre-publication it is likely that we would have been saved the necessity of reading it and he of having to debunk it.

So not only does the process take up a good deal of time (of those that reviewed it) but it fails to prevent even more time being wasted post publication.

That alone would seem to be important.

Surely the review process is meant to be for the protection of good science from bad, not just a tick in a box. If the latter is becoming the case, perhaps the whole process should come under review.

Best Wishes

Alexander Harvey
Lars Tranvik says

14 Dec 2007 at 10:34 AM

I am not a climatologist, but someone studying climate effects on ecosystems. I have a question regarding past reconstructions: A paper from 1992, Jaworowski et al. (Sci Tot Environ 114:227-284), criticizes the value of ice core CO2 data. I found some discussion of Jaworowski and his criticism at realclimate.org and elsewhere, but nothing explicit on the following issue: The paper is said to contain a graph (fig. 10) which has been reproduced and highlighted very recently by Swedish climate skeptics, and which claims that the ice data has been manipulated by moving data 80 years along the time scale to make the Mauna Loa time series fit with data from an ice core reflecting the last couple of centuries (Siple). I can only access more recent volumes of this journal, and have only seen this graph out of context. The paper has been cited only 7 times so far (according to ISI), and is apparently taken rather lightly by the research community. I would like a concise explanation (and maybe some good links/references) explaining this purported mismatch between ice data and modern measurements. Were data shifted for a good scientific reason, is the claim by Jaworowski that they were shifted at all incorrect, or is there no good explanation for the mismatch?
Jack Garman - Amateur says

16 Dec 2007 at 4:17 PM

Hank Roberts and Timothy Chase have directed me to documents that seem to say the following:

‘The great fluctuations in ocean temperature are found to be less dramatic when data is more carefully assessed, although there is still a clear trend upward. However, the presence of this trend is not a stark indication of human impact because the useful data only covers four decades of change in systems that certainly operate over massively longer periods of time.’

Have I gotten on the right track here?