Many people hold the mistaken belief that reconstructions of past climate are the sole evidence for current and future climate change. They are not. However, they are very interesting and useful for all sorts of reasons: for modellers to test out theories of climate change, for geographers, archaeologists and historians to examine the impact of climate on past civilizations and ecosystems, and for everyone to get a sense of what climate is capable of doing, how fast it does it and why.
As a small part of that enterprise, the climate of the medieval period has received a very high (and sometimes disproportionate) profile in the public discourse – due in no small part to the mistaken notion that it is an important factor for the attribution of current climate change. Its existence as a period of generally warmer temperatures (at least in the Northern hemisphere) than the centuries that followed is generally accepted. But the timing, magnitude and spatial extent are much more uncertain. All previous multiproxy reconstructions indicate a Northern Hemisphere mean temperature less than current levels, though possibly on a par with the mid-20th century. But there are only a few tenths of a degree in it, and so the description that it is likely to have been warmer now (rather than virtually certain) is used to express the level of uncertainty.
A confounding factor in discussions of this period is the unfortunate tendency of some authors to label any warm peak prior to the 15th Century as the ‘Medieval Warm Period’ in their record. This leads to vastly different periods being similarly labelled, often giving a misleading impression of coherence. For instance, in a recent paper it was defined as 1200-1425 CE, well outside the ‘standard’ definition of 800-1200 CE espoused by Lamb.
Since a new ‘reconstruction’ of the last 2000 years from Craig Loehle is currently doing the rounds, we thought it might be timely to set out what the actual issues are in making such reconstructions (as opposed to the ones that are more often discussed), and how progress is being made despite the pitfalls.
The Loehle paper was published in Energy and Environment – a journal notable only for its rather dubious track record of publishing contrarian musings. The reconstruction itself is based on a network of 18 records that are purportedly local temperature proxies, and we will use those as examples in the points below. More discussion of this paper is available here (via the wayback machine).
Issue 1: Dating
Nothing is more important than chronology in paleo-climate. If you can’t line up different records accurately, you simply can’t say anything about cause and effect or coherence or spatial patterns. Records where years can be accurately counted are therefore at a premium in most reconstructions. These encompass tree ring width, density and isotopes, some ice cores, corals, and varved lake sediments. The next most useful set of data are sources that have up to decadal resolution but that can still be dated relatively accurately. High-resolution ocean sediment cores can sometimes be found that fit this, as can some cave (speleothem) records and pollen records etc.
There are nonetheless more problems with the decadal data – they may have been smoothed by non-climate processes, and their dating may be off by a decade or two. But there are enough records that are widely enough dispersed to make them a useful adjunct in a reconstruction that hopes to capture decadal to multi-decadal variability.
Using data that has significantly worse resolution than that in reconstructions of recent centuries is asking for trouble. The age models tend to have errors in the hundreds of years, and the density of points rarely allows one to reach the modern instrumental period.
For instance, South-Eastern Atlantic ocean sediment data from Farmer et al (2005) (Loehle data series #17) nominally goes up to the present (0 calendar years BP). This is really 1950 due to the convention that years ‘Before Present’ (BP) almost invariably begin then (some recent papers use BP(2000) to indicate a different convention, but that is always specifically pointed out). However, the earliest real date for that core is 1053 BP, with a 2-sigma range of 1303 to 946 BP – almost 400 years! That makes this data completely unsuitable for reconstructions of the last 2000 years – which in all fairness, was certainly not the focus of the original paper.
Similar issues arise with data from DeMenocal et al (2000) (Loehle #10) and SSDP-102 (Kim et al, 2004) (Loehle #18). In the first record, the initial data point nominally comes from 88 BP (i.e. 1862 CE), but the earliest dated sample is around 500 BP. In the second, the initial date is closer to the present (1940), but the age model is constrained by only 3 ages over the whole Holocene (and it’s not clear that any are within the last two millennia). So while both records have more apparent resolution than Farmer et al, their use in a reconstruction of recent paleo-climate is dubious.
It should probably be pointed out that the Loehle reconstruction has mistakenly shifted all three of these records forward by 50 years (due to erroneously assuming a 2000 start date for the ‘BP’ time scale). Additionally, the series used by Loehle for the Farmer et al data is not the SST reconstruction at all, but the raw Mg/Ca measurements! Loehle #12 (Calvo et al, 2002) is also off by 50 years, but since it doesn’t start until 1440 CE, its presence in this collection is surprising in any case. The dates on two other ocean sediment cores (Stott et al 2004 – #14 and #15) are on the correct scale thankfully, but are still marginal in terms of resolution (29 and 44 years respectively, but effectively longer still due to bioturbation of the sediments). Neither of them, however, extends beyond the mid-20th Century (end points of 1936 CE and 1810 CE) and so aren’t much use for looking at medieval-vs-modern data.
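To make the date arithmetic above concrete, here is a minimal sketch of the BP-to-CE conversion, assuming the usual convention that ‘present’ means 1950. The function name `bp_to_ce` and its `epoch` parameter are illustrative only and do not come from any of the papers discussed.

```python
BP_EPOCH_CE = 1950  # conventional "present" for dates quoted in years BP


def bp_to_ce(years_bp, epoch=BP_EPOCH_CE):
    """Convert an age in years BP to a calendar year CE."""
    return epoch - years_bp


# Dates quoted in the text
first_point = bp_to_ce(88)                        # 1862 CE
two_sigma = (bp_to_ce(1303), bp_to_ce(946))       # (647, 1004) CE
two_sigma_width = two_sigma[1] - two_sigma[0]     # 357 years

# Wrongly assuming the BP scale starts in 2000 moves every date
# 50 years too late.
shift = bp_to_ce(88, epoch=2000) - bp_to_ce(88)   # 50
```

The same one-line subtraction, with the wrong epoch, is all it takes to produce the 50-year offset described above.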
Other dating issues arise if the age model was tuned for some purpose. For longer time scale records, the dates are often tuned to ‘orbital forcing’ periods based on the understanding that precession and obliquity do have strong imprints in many records. However, in doing so, you remove the ability to assess with that record whether the orbital expression is leading or lagging another record. Since reconstructions of recent centuries are often pored over for signs of solar or volcanic forcing, it is crucial not to use those signals to adjust the age model. Unfortunately, the Mangini et al (2005) speleothem record (Loehle #9) was tuned to a reconstruction of solar activity so that the warm periods lined up with solar peaks. This invalidates its use on that age model for any useful reconstruction, since it would be assuming a relationship one would like to demonstrate. If put on a less biased age model, it could be useful however (but see issue 3 as well).
Issue 2: Fidelity
This issue revolves around what the proxy records are really recording and whether it is constant in time. This is of course a ubiquitous problem with proxies, since it is well known that no ‘perfect’ proxy exists i.e. there is no real world process that is known to lead to proxy records that are only controlled by temperature and no other effect. This leads to the problem that it is unclear whether the variability due to temperature has been constant through time, or whether the confounding factors (that may be climatic or not) have changed in importance. In the case where the other factors seem to be climatic (d18O in ice cores for instance), the data can sometimes be related to some other large scale pattern – such as ENSO – and could thus be an indirect measure of temperature change.
In many cases, proxies such as Mg/Ca ratios in foraminifera have laboratory and in situ calibrations that demonstrate a fidelity to temperature. However, some proxies, like d18O, which do have a temperature component, also have other factors that affect them. In forams, the other factors involve changes in water mass d18O (correlated to salinity), or changes in seasonality. In terrestrial d18O records, the precipitation patterns, timing and sources are important – more so in the tropics than at high latitudes though.
A more prosaic, but still important, issue is the nature of what is being recorded. Low resolution data is often not a snapshot in time, but part of a continuous measurement. Therefore the 100 year spaced pollen reconstruction data from Viau et al (2006) (Loehle #13) are not estimates for the mid-point of each century, but are century averages. Linear interpolation between these points will give a series that actually has different century-long means. The simplest approach is to use a continuous step function with each century given the mean, or a spline fit that preserves the average rather than the mid-point value. It’s not clear whether the low resolution series in Loehle (#4, #5, #6, #10, #13, #14, #15, #17, #18) were treated correctly (though to be fair, other reconstructions have made similar errors). It remains unclear how important this is.
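The point about century averages can be illustrated with a small sketch. The three century means below are invented purely for illustration (they are not from Viau et al or any other record), and the helper names are hypothetical. Treating the means as mid-point values and interpolating linearly changes the mean of the middle century, while a step function preserves it.

```python
from statistics import mean

# Hypothetical century-mean anomalies for years 0-99, 100-199, 200-299
century_means = [0.0, 1.0, 0.0]
midpoints = [50, 150, 250]


def linear_interp(year):
    """Treat each century mean as a value at the century mid-point."""
    if year <= midpoints[0]:
        return century_means[0]
    if year >= midpoints[-1]:
        return century_means[-1]
    for i in range(len(midpoints) - 1):
        x0, x1 = midpoints[i], midpoints[i + 1]
        if x0 <= year <= x1:
            y0, y1 = century_means[i], century_means[i + 1]
            return y0 + (y1 - y0) * (year - x0) / (x1 - x0)


def step(year):
    """Give every year in a century that century's mean."""
    return century_means[year // 100]


middle_century = range(100, 200)
interp_mean = mean(linear_interp(y) for y in middle_century)  # about 0.75
step_mean = mean(step(y) for y in middle_century)             # 1.0
```

In this toy case the interpolated series understates the middle century by a quarter of its amplitude. How much this matters for a real record depends on how sharply neighbouring century means differ.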
Issue 3: Calibration
Correlation does not equal causation. And so a proxy with a short period calibration to temperature with no validating data cannot be fully trusted to be a temperature proxy. This arises with the Holmgren et al (1999) speleothem grey-scale data (Loehle #11) which is calibrated over a 17 year period to local temperature, but without any ‘out-of-sample’ validation. The problem in that case is exacerbated by the novelty of the proxy. (As an aside, the version used by Loehle is on an out-of-date age model (see here for an up-to-date version of the source grey-scale data – convert to temperature using T=8.66948648-G*0.0378378) and is already smoothed with a backwards running mean implying that the record should be shifted back ~20 years).
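For readers who want to apply the grey-scale conversion quoted above, here is a minimal sketch. The coefficients are the ones given in the text. The function names are hypothetical, and the 5-point smoothing window is an assumption chosen only to show why a backwards running mean calls for a shift in time; the actual smoothing window of the record is not stated here.

```python
def grey_to_temperature(grey):
    """Apply the linear conversion quoted in the text: T = a - G * b."""
    return 8.66948648 - grey * 0.0378378


def trailing_mean(values, window):
    """Backwards running mean: each output averages the current point
    and the (window - 1) points before it."""
    return [
        sum(values[i - window + 1 : i + 1]) / window
        for i in range(window - 1, len(values))
    ]


# A linear ramp makes the lag easy to see: value == time step.
ramp = [float(t) for t in range(10)]
window = 5  # hypothetical window length
smoothed = trailing_mean(ramp, window)

# The first smoothed value is stamped at time step window - 1 = 4,
# but it equals the ramp value at time step 2, i.e. it lags by
# (window - 1) / 2 steps.
lag = (window - 1) - smoothed[0]
```

A smoothed value therefore describes conditions centred half a window earlier than its nominal date, which is why a record smoothed this way should be shifted back in time.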
As mentioned above, there are a priori reasons to assume that terrestrial d18O records have a temperature component. In mid-latitudes, the relationship is positive – higher d18O in precipitation in warmer conditions. This is a function of the increase in fractionation as water vapour is continually removed from the air. Most d18O records – in cave stalagmites, lake sediment or ice cores – are usually interpreted this way since most of their signal is from the rain water d18O. However, only one terrestrial d18O record is used by Loehle (#9 Spannagel), and this has been given a unique negative correlation to temperature. This might be justified if the control on d18O in the calcite was from local cave temperature impact on fractionation, but the slope used (derived from a 5-point calibration) is more negative even than that. Unfortunately, no validation of this temperature record has been given.
Issue 4: Compositing
Given a series of records with different averaging periods, spatial representation and noise levels, there are a number of problems in constructing a composite. Equal averaging is simple but, for instance, implies giving equal weight to a century-mean North American continental average (Viau et al, Loehle #13) and to a single decadally varying N. American point (Cronin et al, #3), despite the fact that one covers a vast area and time period and the other is much less representative. Unsurprisingly, the larger average has much less variability than the single point. To address this disparity, a common practice is to normalise the records by their standard deviation and to areally weight them; without that, the more representative sample ends up playing a much smaller role.
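As a rough sketch of the normalise-and-weight idea, the snippet below standardises two invented series and combines them with invented area weights. None of the numbers correspond to the Viau or Cronin records, and the function names are hypothetical. It is meant only to show the mechanics, not to reproduce any published compositing method.

```python
from statistics import mean, pstdev


def standardise(series):
    """Remove the mean and scale to unit standard deviation.
    Assumes the series is not constant."""
    m, s = mean(series), pstdev(series)
    return [(x - m) / s for x in series]


def weighted_composite(records, weights):
    """Weighted average of standardised records, time step by time step."""
    z = [standardise(r) for r in records]
    total = sum(weights)
    n = len(z[0])
    return [
        sum(w * zr[t] for w, zr in zip(weights, z)) / total
        for t in range(n)
    ]


# Hypothetical data: a smooth large-area average and a noisy single site
large_area = [0.0, 0.1, 0.2, 0.1, 0.0]
single_site = [1.0, -2.0, 3.0, -1.0, 0.5]

# A plain average of the raw values is dominated by the noisy site.
raw_average = [(a + b) / 2 for a, b in zip(large_area, single_site)]

# Hypothetical area weights: the continental average counts three times
# as much as the single point.
composite = weighted_composite([large_area, single_site], [3.0, 1.0])
```

After standardising, both records have the same variance, so the weights alone decide how much each one contributes.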
Another approach, used implicitly in climate field reconstruction methods (like RegEM for instance), is to use current instrumental records to assess the relevance of any particular point, region or time period to the desired target. Another idea would be to estimate the changes in noise characteristics over larger areas and longer times and build that into the normalisation. That might also be useful for records whose resolution decays in time (the GRIP borehole temperature for instance, Loehle #1).
Finally, one needs to be very careful to deal with each series consistently. Treating an interpolated low-resolution record differently to another low-resolution record that wasn’t interpolated seems inconsistent. Keigwin’s Sargasso Sea record is very low-resolution (Loehle #4) but Loehle appears to use it as though it was a real high-resolution record, while Kim et al (Loehle #18), which is equally low-res, is only used within 15 years of a datapoint.
Issue 5: Validation
It is inevitable that many seemingly ad-hoc decisions need to be made in building a particular reconstruction. This is not in itself cause for concern – the inhomogeneity of the data and its sparsity require that kind of consideration. Given that there is then no mathematically perfect way of doing this, the test of whether any particular approach is worthwhile lies in the validation i.e. does the reconstruction give a reasonable fit to the target field or index over a period or with data that wasn’t used in the calibration? There’s a good discussion of these issues in two recent papers in Climatic Change (Wahl and Ammann, 2007; Ammann and Wahl, 2007) in relation to the original Mann, Bradley & Hughes papers. One would also like to test how sensitive the answers are to other equally sensible choices – a result can be considered robust if it is relatively insensitive to such methodological choices.
What does this imply for Loehle’s reconstruction? Unfortunately, the number of unsuitable series, errors in dating and transcription, combined with a mis-interpretation of what was being averaged, and a lack of validation, do not leave very much to discuss. Of the 18 original records, only 5 are potentially useful for comparing late 20th Century temperatures to medieval times, and they don’t have enough coverage to say anything significant about global trends. It’s not clear to me what impact fixing the various problems would have or what that would imply for the error bars, but as it stands, this reconstruction unfortunately does not add anything to the discussion.
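One way to picture validation against withheld data is the sketch below. It uses the ‘reduction of error’ (RE) skill score, which is one statistic used in this literature; whether it is the right choice for any given reconstruction is a separate question. All the numbers are invented for illustration and the function name is hypothetical.

```python
from statistics import mean


def reduction_of_error(observed, reconstructed, calibration_mean):
    """RE = 1 - SSE(reconstruction) / SSE(calibration-period mean).
    RE = 1 is a perfect fit; RE <= 0 means the reconstruction does no
    better than simply guessing the calibration-period mean."""
    sse = sum((o - r) ** 2 for o, r in zip(observed, reconstructed))
    sse_ref = sum((o - calibration_mean) ** 2 for o in observed)
    return 1.0 - sse / sse_ref


# Hypothetical instrumental data, split into two periods
calibration_obs = [0.0, 0.1, 0.2, 0.3]        # used to fit the proxies
verification_obs = [-0.3, -0.2, -0.1, -0.2]   # withheld from the fit

# Hypothetical reconstruction values over the withheld period
verification_recon = [-0.25, -0.25, -0.05, -0.15]

calibration_mean = mean(calibration_obs)
re_score = reduction_of_error(
    verification_obs, verification_recon, calibration_mean
)
```

The key design point is that the verification data play no part in the calibration, so a positive score reflects skill rather than overfitting.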
So where does this all leave us? Since the early days of multi-proxy reconstructions a decade ago the amount of suitable data has definitely increased, and so many of the issues related to specific proxies are becoming increasingly unimportant. As the amount of data grows, the picture of climate in medieval times will likely become clearer. What seems even more likely is that the structure of the climate anomalies will start to emerge. The simple question of whether the medieval period was warm or cold is not particularly interesting – given the uncertainty in the forcings (solar and volcanic) and climate sensitivity, any conceivable temperature anomaly (which remember is being measured in tenths of a degree) is unlikely to constrain anything.
However, if the tantalising link between medieval American mega-droughts and potential long-term La Nina conditions in the Pacific can be better characterised, that could be very useful for constraining ENSO sensitivity to climate change – something of great interest to many people. That will have to wait for a better next-generation reconstruction though.
Thanks to Eric Swanson for helping find some of the more interesting choices made by Loehle in his reconstruction, and Karin Holmgren for swift responses to my queries about her data.
Update (Jan 22): Loehle has issued a correction that fixes the more obvious dating and data treatment issues, but does not change the inappropriate data selection, or the calibration and validation issues.