2022-02-02
We have now updated the model-observations comparison page for the 2021 SAT and MSU TMT datasets. Mostly this is just ‘another dot on the graphs’ but we have made a couple of updates of note. First, we have updated the observational products to their latest versions (i.e. HadCRUT5, NOAA-STAR 4.1 etc.), though we are still using NOAA’s GlobalTemp v5 – the Interim version will be available later this year. Secondly, we have added a comparison of the observations to the new CMIP6 model ensemble.
As we’ve discussed previously, the CMIP6 ensemble contains a dozen models (out of ~50) with climate sensitivities that are outside the CMIP5 range, and beyond the very likely constraints from the observations. This suggests that comparisons to the observations should be weighted in some way. One reasonable option is to follow the work of Tokarska et al (2020) and others, and restrict the comparison to those models that have a transient climate response (TCR) that is consistent with observations. The likely range of TCR is 1.4ºC to 2.2ºC according to IPCC AR6, and so we plot both the mean and 95% spread over all models (1 ensemble member per model) (grey) and the TCR-screened subset (pink).
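For readers who want to see the mechanics, the screening described above can be sketched in a few lines of Python. This is only an illustration, not the code behind the figures: the model names, TCR values and anomaly series are made up, the data layout (one anomaly series per model) is an assumption, and the 95% spread is taken here as the 2.5th to 97.5th percentile range across models, which is one plausible reading.

```python
from statistics import mean

# Likely TCR range quoted in the text from IPCC AR6 (degrees C).
TCR_MIN, TCR_MAX = 1.4, 2.2


def percentile(values, q):
    """Linearly interpolated percentile (q in [0, 100]) of a non-empty list."""
    ordered = sorted(values)
    pos = (len(ordered) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(ordered) - 1)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (pos - lo)


def screen_by_tcr(models, tcr_min=TCR_MIN, tcr_max=TCR_MAX):
    """Keep only the models whose TCR lies inside [tcr_min, tcr_max]."""
    return {name: m for name, m in models.items()
            if tcr_min <= m["tcr"] <= tcr_max}


def ensemble_summary(models):
    """Per-year (mean, 2.5th percentile, 97.5th percentile) across models."""
    series = [m["anomaly"] for m in models.values()]
    return [(mean(vals), percentile(vals, 2.5), percentile(vals, 97.5))
            for vals in zip(*series)]


# Hypothetical illustration data, NOT real CMIP6 output.
# One ensemble member (one anomaly series) per model.
models = {
    "model_a": {"tcr": 1.6, "anomaly": [0.9, 1.0, 1.1]},
    "model_b": {"tcr": 2.0, "anomaly": [1.0, 1.1, 1.2]},
    "model_c": {"tcr": 2.8, "anomaly": [1.4, 1.6, 1.8]},
}

screened = screen_by_tcr(models)               # drops model_c (TCR too high)
all_summary = ensemble_summary(models)         # the "grey" envelope
screened_summary = ensemble_summary(screened)  # the "pink" envelope
```

In this toy case the high-TCR model pulls the full-ensemble mean upward, and removing it lowers the mean and narrows the spread, which is the qualitative effect the screening is meant to show.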
These model simulations have only just been finished, and even though they only used observed boundary conditions (such as GHGs, aerosols, volcanoes and solar activity etc.) up until 2014, and scenarios thereafter (SSP2-4.5 in this figure), it is far too soon to say whether their projections have been validated or not. Additionally, the analysis of these models as an ensemble is only just beginning, and so there may be other methods for looking at this that are more useful.
For the longer term evaluation of this class of climate model, it’s useful to look at the earlier ensembles, such as CMIP3 (which were run almost two decades ago now!). They seem to be doing (surprisingly) well!
It’s perhaps important to remember why we maintain this archive. The point is not to claim that models are perfect or can’t be improved. Despite the impressive projections from the CMIP3 models seen above, the match to observations in CMIP6 is much better across a whole suite of important features (e.g. Orbe et al, 2020). However, we frequently see references to claims that models have never predicted anything successfully, or that such comparisons are rigged – frequently accompanied by misleading graphics that play games with the baselines, or hide the observational uncertainty, or don’t compare like-with-like. Our comparisons strive to be fair and since we have maintained them stably for many years, it should be clear that they aren’t being manipulated to adapt to changes in our understanding of the observational record.
In particular, the CMIP5 comparisons to the MSU/AMSU TMT records still show interesting discrepancies which remain a little enigmatic (even if the differences are not as large or as important as some people claim). For instance, it’s still not clear to what extent the specific internal variations of Pacific variability or the impact of inaccurate forcings or the uncertainties in parameterizing deep convection or stratospheric-tropospheric exchange, or further structural issues in the observations, or a combination of all, are responsible. These issues, and how best to regard the CMIP6 ensemble, will likely be more thoroughly explored in the coming months and years, and we will be able to tell, hopefully, how skillful the climate models are and hone our expectations for the future.
Joeri Rogelj says
Hi Gavin,
Always a pleasure to see these regular updates. These comparisons, in particular the one to the CMIP3 ensemble, might be even more compelling if accompanied by a simple graph showing assumed emissions or concentrations in the driving scenario and the historically observed ones.
Cheers
Mark BLR says
NB : I do not know how to phrase this so as not to come across as “snarky / overly cynical”, but note that while it is primarily labelled as “interesting / eyebrow raising” in my mind I do also classify it as “amusing” …
I agree that the (unadjusted) “CMIP3 (circa 2004)” graph is an impressive fit to the instrumental data records.
Why, then, does “CMIP5 (circa 2011)” require two graphs ?
1) A “forcing adjusted” version, and
2) A “[ forcing adjusted and ] using the blended SST/SAT product from the CMIP5 models produced by Cowtan et al (2015) instead of the pure SAT field” version
Using more technical language, why does the (older / obsolete) CMIP3 “ensemble mean + confidence interval” graph, with the common “Historical Data” emissions parameters used for all model runs only going up to 2000, not need “updates to reflect reality” to its forcing inputs … but the (newer / more up-to-date) CMIP5 one, with “Historical Data” both revised and updated to 2005, does ?
PS : Kudos for adding the “forcing adjusted” versions as dashed lines for “before and after” comparison purposes. All too often I go back to websites and find “updated / adjusted / homogenised” datasets have replaced the original projections / “Raw Data (V1)”.
Gavin says
There are multiple CMIP5 graphs because that was (and remains) the most studied ensemble (at least for the time being), and so we have options about what to plot. The issues addressed in those extra graphs – whether comparing SAT in a model is exactly the right thing to compare to the SAT/SST blend that the global observations use, and whether the forcing data sets in CMIP5 were biased for some reason – are ones that people have thought about. We can illustrate the difference that it makes for the CMIP5 case. The SAT vs SAT/SST issue is relevant for CMIP3 and CMIP6 too, but as we can see is not a big factor. The forcing issue was specific to CMIP5, because that was more prescribed in CMIP5 than in CMIP3 and so any biases affected the whole ensemble. CMIP3 was a little looser (hence the wider spread over time) and there isn’t evidence of any systematic bias in the forcing. It’s not clear (yet) if there are sufficient issues in the CMIP6 forcing to make a difference (at this point I don’t think so – other than the COVID-related changes in 2020). – gavin
nigelj says
I have an observation on the graphs in the article, made as a lay person, and would appreciate feedback from anyone. The actual warming trend seems to be in the middle range of predictions, suggesting climate sensitivity MIGHT also be in the middle range of predictions (quantity of warming for doubling of CO2), thus “medium climate sensitivity” maybe around 3 degrees. However sea level rise appears to be tracking towards the UPPER range of predictions (from what I have read) and we have concerns about a relatively short term destabilisation of the West Antarctic ice sheet in the media recently, and we see weather extremes becoming alarmingly more extreme, perhaps more than previously anticipated.
Does all this suggest that climate sensitivity might be in the middle, but the way the weather and ice sheets respond to that level of warming and sensitivity is greater than previously anticipated? Putting it another way weather and ice sheet behaviour is very sensitive to even small temperature changes more so than previously thought. Is there any formal science on all that?
albert says
In my opinion a relevant issue for the scientific community is to understand why the match of CMIP6 projections to observations is much better on many important features and at the same time is definitely worse in comparison to CMIP3 regarding the average global surface temperature trend.
Dominik Lenné says
Takeaway for me: That not one model but a whole subset of CMIP6 models shows a similar deviation from observations remains strange & I’m looking forward to seeing it explained (in ways I can understand).
zebra says
Can someone explain why the observed temperature anomaly is different in the two graphs?
[Response: Top one is ºC, bottom is ºF. I like to mix it up sometimes… – gavin]
zebra says
Gavin,
I understand…. I always used to say “I did it to see if you students were paying attention”.