Is Climate Modelling Science?

12 Jan 2005 by Gavin

At first glance this seems like a strange question. Isn’t science precisely the quantification of observations into a theory or model and then using that to make predictions? Yes. And are those predictions in different cases then tested against observations again and again to either validate those models or generate ideas for potential improvements? Yes, again. So the fact that climate modelling was recently singled out as being somehow non-scientific seems absurd.
by Gavin Schmidt (French translation by Gilles Delaygue)


Granted, the author of the statement in question has little idea of what climate modelling is, or how or why it’s done. However, that his statement can be quoted in a major US newspaper says much about the level of public knowledge concerning climate change and the models used to try and understand it. So I will try here to demonstrate how the science of climate modelling works, and yes, why it is a little different from some other kinds of science (not that there’s anything wrong with that!).

Climate is complex. Since climatologists don’t have access to hundreds of Earths to observe and experiment with, they need virtual laboratories that allow ideas to be tested in a controlled manner. The huge range of physical processes involved is encapsulated in what are called General Circulation Models (or GCMs). These models consist of connected sub-modules that deal with radiative transfer, the circulation of the atmosphere and oceans, the physics of moist convection and cloud formation, sea ice, soil moisture and the like. They contain our best current understanding of how the physical processes interact (for instance, how evaporation depends on the wind and surface temperature, or how clouds depend on the humidity and vertical motion) while conserving basic quantities like energy, mass and momentum. These estimates are based on physical theories and empirical observations made around the world. However, some processes occur at scales too small to be captured at the grid-size available in these (necessarily global) models. These so-called ‘sub-gridscale’ processes therefore need to be ‘parameterised’.

A good example is related to clouds. Obviously, in an actual cloud, the relative humidity is close to 100%, but at a grid box scale of hundreds of km, the mean humidity (even if there are quite a few clouds) will be substantially less. Thus a parameterisation is needed that relates the large-scale mean values to the actual distribution of clouds that one would expect in a grid box. There are of course many different ways to do that, and the many modelling groups (in the US, Europe, Japan, Australia etc.) may each make different assumptions and come up with slightly different results.
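To make the idea concrete, here is a minimal sketch of one well-known family of cloud parameterisations, a relative-humidity scheme of the Sundqvist type, in which cloud fraction rises from zero once the grid-mean humidity passes a critical threshold. The post does not say which scheme any particular model uses, and the function name and the threshold of 0.8 are illustrative assumptions, not values taken from a real GCM.

```python
import math


def cloud_fraction(mean_rh: float, rh_crit: float = 0.8) -> float:
    """Diagnose grid-box cloud fraction from grid-mean relative humidity.

    Both arguments are fractions between 0 and 1. The default rh_crit of 0.8
    is an assumed, illustrative threshold, not a tuned model value.
    """
    if not 0.0 < rh_crit < 1.0:
        raise ValueError("rh_crit must lie strictly between 0 and 1")
    rh = min(max(mean_rh, 0.0), 1.0)  # clamp to the physical range
    if rh <= rh_crit:
        return 0.0  # grid box too dry on average to hold any cloud
    # Sundqvist-type form: fraction grows smoothly to 1 as rh approaches 1
    return 1.0 - math.sqrt((1.0 - rh) / (1.0 - rh_crit))
```

A grid box can therefore be partly cloudy while its mean humidity is well below 100%. Changing `rh_crit`, or swapping in a different functional form, is the kind of choice on which modelling groups may differ.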

It’s important to note what these models are not good for. They aren’t any good for your local weather, or the temperature of the water at the nearest beach, or for the wind in downtown Manhattan, because these are small scale features, affected by very local conditions. However, if you go up to the regional scale and beyond (e.g. Western Europe as a whole, the continental US) you start to expect better correlations.

One of the most important features of complex systems is that most of their interesting behaviour is emergent. It’s often found that the large scale behaviour is not a priori predictable from the small scale interactions that make up the system. So it is with climate models. If a change is made to the cloud parameterisation, it is difficult to tell ahead of time what impact that will have on, for instance, the climate sensitivity. This is because the number of possible feedback pathways (both positive and negative) is literally uncountable. You just have to put it in, let the physics work itself out and see what the effect is.

This means that validating these models is quite difficult. (NB. I use the term validating not in the sense of ‘proving true’ (an impossibility), but in the sense of ‘being good enough to be useful’). In essence, the validation must be done for the whole system if we are to have any confidence in the predictions about the whole system in the future. This validation is what most climate modellers spend almost all their time doing. First, we look at the mean climatology (i.e. are the large scale features of the climate reasonably modelled? Does it rain where it should? Is there snow where there should be? Are the ocean currents and winds going the right way?), then at the seasonal cycle (what does the sea ice advance and retreat look like? Does the inter-tropical convergence zone move as it should?). Generally we find that the models actually do a reasonable job (see here or here for examples of different groups’ model validation papers). There are of course problematic areas (such as eastern boundary regions of the oceans, circulation near large mountain ranges etc.) where important small scale processes may not be well understood or modelled, and these are the chief targets for further research by model developers and observationalists.

Then we look at climate variability. This step is key, but it is also quite subtle. There are two forms of variability: intrinsic variability (that occurs purely as a function of the internal chaotic dynamics of the system), and forced variability (changes that occur because of some external change, such as solar forcing). Note that ‘natural’ variability includes both intrinsic and forced components due to ‘natural’ forcings, such as volcanoes, solar or orbital changes. A clean comparison relies on either being able to isolate just one reasonably known forcing, or having enough data to be able to average over many examples and thus isolate the patterns associated solely with that forcing, even though in any particular case, more than one thing might have been happening. (A more detailed discussion of these points is available here).
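The "average over many examples" idea can be sketched as a simple composite: align several anomaly series on the time of the forcing event and average them step by step, so that the intrinsic variability, which differs from case to case, tends to cancel while the common forced response remains. The function name and the numbers in the usage example are made up for illustration and are not observations.

```python
from statistics import fmean


def composite(events: list[list[float]]) -> list[float]:
    """Average event-aligned anomaly series element by element."""
    if not events:
        raise ValueError("need at least one event")
    length = len(events[0])
    if any(len(series) != length for series in events):
        raise ValueError("all events must cover the same number of time steps")
    # zip(*events) groups the values that share the same time step
    return [fmean(step) for step in zip(*events)]


# Made-up anomalies (deg C) for two events, each sharing a cooling signal
# with different noise superimposed on it.
events = [
    [0.1, -0.3, -0.3],
    [-0.1, -0.5, -0.1],
]
signal = composite(events)  # the noise largely cancels in the average
```

With only two events the cancellation is crude; the more independent examples are averaged, the more cleanly the forced pattern stands out.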

While there is good data over the last century, there were many different changes to the planet’s radiation balance (greenhouse gases, aerosols, solar forcing, volcanoes, land use changes etc.), some of which are difficult to quantify (for instance the indirect aerosol effects) and whose history is not well known. Earlier periods, say 1850 going back to the 1500s or so, have reasonable coverage from paleo-proxy data, and only have solar and volcanic forcing. In my own group’s work, we have used the spatial patterns available from proxy reconstructions of this period to look at both solar and volcanic forcing in the pre-industrial period. In both cases, despite uncertainties (particularly in the magnitude of the solar forcing), the comparisons are encouraging.

Recent volcanic eruptions have also provided very good tests of the models’ water vapour feedbacks (Soden et al., 2002), dynamical feedbacks (Graf et al., 1994; Stenchikov et al., 2002), and overall global cooling (Hansen et al., 1992). In fact, the Hansen et al. (1992) paper actually predicted the temperature impact of Pinatubo (around 0.5 deg C) prior to it being measured.

The mid-Holocene (6000 years ago) and Last Glacial Maximum (~20,000 years ago) are also attractive targets for model validation, and while some successes have been noted (e.g. Joussaume et al., 1999; Rind and Peteet, 1985) there is still some uncertainty in the forcings and response. Other periods such as the 8.2kyr event, or the Paleocene-Eocene Thermal Maximum are also useful, but clearly the further back in time one goes, the more uncertain the test becomes.

The 20th century, though, still provides the test that appears to be most convincing. That is to say, the models are run over the whole period, with our best guesses for what the forcings were, and the results compared to the observed record. If by leaving out the anthropogenic effects you fail to match the observed record, while if you include them, you do, you have a quick-and-dirty way to do ‘detection and attribution’. (There is a much bigger literature that discusses more subtle and powerful ways to do D&A, so this isn’t the whole story by any means). The most quoted example of this is from the Stott et al. (2000) paper shown in the figure. Similar results can be found in simple models (Crowley, 2000) and in more up to date models (Meehl et al., 2004).
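The quick-and-dirty comparison described above can be sketched as scoring two model runs against the same observed series and seeing which one fits better. This is only an illustration of the logic, not the method used in the cited papers; the function names and the use of root-mean-square error as the score are assumptions.

```python
import math


def rmse(model: list[float], obs: list[float]) -> float:
    """Root-mean-square difference between a model series and observations."""
    if not obs or len(model) != len(obs):
        raise ValueError("series must be non-empty and of equal length")
    return math.sqrt(sum((m - o) ** 2 for m, o in zip(model, obs)) / len(obs))


def better_match(obs, natural_only, all_forcings):
    """Return the name of the run closer to observations, plus both scores."""
    scores = {
        "natural_only": rmse(natural_only, obs),
        "all_forcings": rmse(all_forcings, obs),
    }
    return min(scores, key=scores.get), scores
```

If the run that includes anthropogenic forcings scores clearly better than the natural-only run, that is the informal attribution argument; formal D&A studies use considerably more careful statistics.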

It’s important to note that if the first attempt to validate the model fails (e.g. the signal is too weak (or too strong), or the spatial pattern is unrealistic), this leads to a re-examination of the physics of the model. This may then lead to additional changes, for example, the incorporation of ozone feedbacks to solar changes, or the calculation of vegetation feedbacks to orbital forcing, which in each case improved the match to the observations. Sometimes though it is the observations that turn out to be wrong. For instance, for the Last Glacial Maximum, the model-data mismatches in tropical sea surface temperatures highlighted by Rind and Peteet (1985) have subsequently been more or less resolved in favour of the models.

So, in summary, the model results are compared to data, and if there is a mismatch, both the data and the models are re-examined. Sometimes the models can be improved, sometimes the data was mis-interpreted. Every time this happens and we get improved matches between them, we have a little more confidence in their projections for the future, and we go out and look for better tests. That is in fact pretty close to the textbook definition of science.



50 Responses to "Is Climate Modelling Science?"

Dano says

12 Jan 2005 at 12:33 PM

Fix your tag for your hyperlink in the second para., guys. [Done].

Nice piece. I like the emergent behavior bit.

D
Brent says

12 Jan 2005 at 1:16 PM

I’ve seen many news reports over the past couple of years about extreme weather conditions which break all previous records, e.g., Europe’s heat wave in summer 2003.

I know climate models do not predict weather as such, but do any major models predict greater variability in weather due to climate change?

Have any studies attempted to link these extremes to anthropogenic climate forcing, or do climatologists not concern themselves with outliers like these? Are these conditions really ‘extreme’ on a longer time scale?

I’d wager Californians particularly may be wondering about this right now.
Watchful Babbler says

12 Jan 2005 at 1:39 PM

The two most common arguments against warming theories seem to be (1) local temperature variations (or mutually-inconclusive data) disprove global warming itself; and (2) models aren’t real science, anyway, so we don’t need to worry about them.

So, to be clear on this, are there *any* validated models that show significant deviation of climatic conditions from the consensus predictions regarding the effect of anthropogenic forcings?

[Response: No, there are not. To clarify though, some deviations are seen in different models, and these may well be ‘significant’ scientifically, but assuming you are talking about the general sense of the response (at the global scale), all models show very similar behaviour. – gavin]
Vish Subramanian says

12 Jan 2005 at 2:10 PM

In my mind, the question is – how many input parameters do the models use? If the number of inputs is many, then the model can be tweaked easily to match past data without giving us much confidence that the future is accurately predicted.

It does seem that today, the number of inputs into climate models (like the cloud parametrization example) runs into the hundreds. Almost equal to the number of outputs, it seems.

The major characteristic of useful and “scientific” models is that the number of inputs is low, and the accuracy is still high.

Vish
DrMaggie says

12 Jan 2005 at 2:57 PM

What I find interesting about climate models is not only their ability to project the overall ‘average properties’ of past, observed climate, but that they also seem to do a pretty good job of reproducing the actual shape of the climate variable distributions. Thus, not only the centroids but also the distribution widths are available in the model outputs, for instance on a seasonal or monthly basis, which makes for interesting comparisons with data (instrumental or from proxy records).

If I have understood e.g. the IPCC report correctly, it is not only what climate change will do to the distribution centroids (e.g. induce an overall annual average warming of, say, the northern hemisphere air temperature) that is important, but what goes on at the tails (upper and lower) of the distribution. (See, for example, this interesting figure.)

The consequences for ecosystems, both natural and man-made (agriculture, forestry), of even relatively small or moderate changes in climate have a potential to be very large, especially if the change occurs rapidly. Even a relatively small upward shift of summer maximum temperatures could be devastating for an ecosystem, and even more so, if, for instance, this coincides with a tendency for lower minimum winter temperature, or a lowering of the typical spring precipitation level.

As pointed out in the main posting, we don’t have a large number of Earths to experiment with, so models are all we have to explore the possible outcomes of e.g. increased atmospheric levels of CO2 or nitrous oxide. When the climate model output is fed into ecosystem models, and these in turn are coupled to socio-economic analysis tools, the potential future scenarios that come out, assuming the world continues its business as usual, appear rather grim; see e.g. the very interesting final report of the European ATEAM project.
James B. Shearer says

12 Jan 2005 at 3:17 PM

I doubt the number of possible feedback pathways in a model is actually literally uncountable.

[Response: in the mathematical sense, no. In the sense of no-one-could-count-them, perhaps… – William]
Watchful Babbler says

12 Jan 2005 at 3:36 PM

Sorry, imprecision in my comment. Yes, I meant the common use of “significant,” not the statistical sense.
Thor Olson says

12 Jan 2005 at 4:00 PM

Nice article. What has always concerned me about the majority of the climate models is their reliance on prior climatology rather than looking for new patterns. To me they are clearly scientific, yet to what end?

Right now there are 4 persistent cold spots and 4 warmer-than-usual spots in the oceans of the Southern hemisphere at nicely spaced intervals. This matches almost perfectly in the Northern hemisphere, making it almost a textbook diagram of the world weather. If your model is looking for evidence to match this, then its predictive power is greatly amplified. If it is predicated on 50 years of climate patterns including El Niños then it may be skewed in a way that does not give the best picture. I have never seen this weather pattern except in theoretical paleoclimate models.
Bruce Frykman says

12 Jan 2005 at 7:01 PM

One might ask the same of business models, are they scientific?

To me science predicts future outcomes while models project outcomes based upon a subset of currently held beliefs. Whether the beliefs are correct is not addressed by a model.

I do not believe anyone has made the claim that future climates are predictable through the use of models, so models are clearly unscientific by my definition of science. I do not mean to say that scientists can’t fiddle with them to gain insight, but in and of themselves they are unscientific entities.

I also quarrel with the idea that a historical “record” of the Earth’s mean surface temperature either exists now or can be reconstructed for 1850 etc. To me a “record” consists of systematized data taken by some uniform and appropriate mechanism for the purpose at hand. This is clearly not the case with the hodge-podge of inconsistent anecdotal references to daily highs and lows in locations purely accidental for scientific purpose. Generalization might be permitted, but records and generalizations are different things.

[Response: Restricting “science” to predicting future outcomes is too strong: it would render observational astronomy or evolution as non-science, which is obviously wrong.
The historical temperature record isn’t perfect, but nor is it the hodge-podge you describe. “Anecdotal” is purely wrong – you may be confusing the instrumental record with the phenomenological one. Most of the records *were* collected for a scientific purpose – usually weather forecasting. Unfortunately I can’t find a good link for the data collection and processing… another post perhaps – William]
John Fleck says

12 Jan 2005 at 7:46 PM

Bruce –

As William noted in his response, your definition of science would clearly exclude large areas of things that I certainly consider “science,” and that I think you would be uncomfortable excluding.

Seismologists do not pretend that their models of tectonic forces can predict, yet I assume you’d not want to exclude seismology as not being “science”. In fact, geology has long functioned in significant measure as a descriptive rather than predictive science. Models play a huge role. To follow your line of argument, you’d have to throw out vast areas of earth science.

Likewise anthropology and archaeology, which do not attempt to predict future outcomes. And ecology. And cosmology. All of these people are doing science, it seems to me. If they’re not doing science, they’re doing something else that’s awfully useful to our understanding of the world around us. Simply defining them away as “unscientific” doesn’t make them any less useful.

Naomi Oreskes has an excellent discussion of this “science=prediction” issue in an essay in Prediction: Science, Decision Making, and the Future of Nature.
Michael Tobis says

12 Jan 2005 at 8:52 PM

I agree with #6. The word “literally” seems to be used figuratively here, which isn’t a good idea. Climate is arguably an infinite collection of infinitesimals, but climate models aren’t.

mt
Michael Tobis says

12 Jan 2005 at 9:52 PM

The great importance of GCMs and CGCMs as compared to simpler models is due to the fact that their global properties are emergent, rather than programmed in as with simpler models. We define the small-scale processes as well as we can and try to get the large scale processes to emerge. This has been possible for the atmosphere alone and the ocean alone for about a quarter century, and for the combination of atmosphere and ocean for about a decade. Such models cannot be regarded as simple exercises in curve-fitting. The fact that this is possible indicates that we have achieved a fundamental understanding of the system, at least on annual to decadal time scales.

Nevertheless, the question asked in #4 is well-posed and crucial. The answer is that while the number of explicit or implicit input parameters into a GCM is in the hundreds at least, the number of degrees of freedom in the output model is enormous. What we see emerging is storms, fronts, the structure of the troposphere and stratosphere, precipitation patterns, soil moisture patterns, etc. These are not just a few numbers but great patterns and patterns of patterns.

That we get these approximately right is a measure of what we know. That we don’t get them exactly right is a measure of what we don’t know. However, it is true enough that the more controls we have over these models, the more we can fudge the results. There is, therefore, a danger that we can add too many “knobs”, and ultimately end up studying the model rather than nature.

I think this is an important criticism from the perspective of science. There is room for concern that the models are overtuned: that the number of processes in the models allows the successful replication of the 20th century record to look better than it is, and allows the broad agreement of CGCMs to look like a more independent set of confirmations than it actually is. James Shearer has argued that this point was reached a decade ago. I disagreed then and stand by this position. However, as more phenomena are added, the point that Vish Subramanian is alluding to now and Shearer advocated some time ago becomes more important.

For what my marginally qualified opinion is worth, I think that for a number of purposes including informing public policy, we need better simple GCMs, not just bigger more tunable models (though these will continue to have important uses). However, this criticism doesn’t detract as much as one might suspect from the role of current GCM results in driving policy.

As Gavin points out in his response to #3, the broad overall behavior of the GCM type models subject to greenhouse forcing is unanimous. This unanimity precedes any fine tuning. Nobody has any idea how to build a GCM that 1) reproduces contemporary climate even approximately correctly and 2) fails to show alarming sensitivity to greenhouse gases.

One presumes this is not for want of trying. Anyone who succeeds in doing so will have made a very valuable contribution to both science and policy. That’s very hypothetical, though. On present evidence, such an achievement is most likely impossible, because the prediction of accelerating climate disruption in response to accelerating forcing is, apparently and unfortunately, robust.

Regarding this most prominent (far from sole) purpose of these models, climate prediction into the next century, there is little doubt that there is a range of anthropogenic forcings for which they are qualitatively correct. There is increasing room for doubt as the perturbation that the models attempt to capture becomes very large. Nonspecialists arguing that the “models are no good” tend to draw exactly the wrong conclusion from this. In fact, if the models should prove wrong for large perturbations they would (almost certainly) be wrong because they fail to capture very large shifts in the climate system that may be ahead. Larger shifts are likely to be more dangerous shifts. The less reliable the models are, the more weight should be given to the risk side of the risk/benefit calculation.

mt
dave says

12 Jan 2005 at 10:28 PM

Re: Models And Prediction

Great post.

As far as modelling not being scientific goes, that’s ridiculous. A model is a theory of climate (in this case) that has been implemented in software and that can be validated (or not) against existing data sets (meaning measurements of the real world – past, historical, recent).

I’m surprised you weren’t a little more specific about what it is the climate models are trying to predict. It is my understanding that the models are attempting to predict the climate sensitivity to doubled CO2 (whenever that occurs), where climate sensitivity is one of these:

1. Equilibrium climate sensitivity : The global mean surface-air temperature warming achieved at long-term equilibrium, for a doubling of atmospheric CO2 over pre-industrial levels, commonly set at 280 parts per million by volume (ppmv).

2. Transient climate sensitivity : The global mean surface-air temperature achieved when atmospheric CO2 concentrations achieve a doubling over pre-industrial CO2 levels increasing at the assumed rate of one percent per year, compounded.

Source Estimating Climate Sensitivity (2003).
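
As a rough illustration of the transient definition above, the time it takes CO2 to double at one percent per year, compounded, can be worked out directly. This is only arithmetic, not a climate model; the function names are made up for the example, and the 280 ppmv starting value is the pre-industrial figure quoted in definition 1.

```python
import math

PREINDUSTRIAL_PPMV = 280.0  # pre-industrial level quoted in definition 1


def years_to_double(annual_growth=0.01):
    """Years for a quantity to double at a fixed compounded growth rate."""
    return math.log(2.0) / math.log(1.0 + annual_growth)


def co2_after(years, start=PREINDUSTRIAL_PPMV, annual_growth=0.01):
    """CO2 concentration (ppmv) after compounding for the given number of years."""
    return start * (1.0 + annual_growth) ** years


if __name__ == "__main__":
    t = years_to_double()
    print(f"doubling time: {t:.1f} years")
    print(f"CO2 at doubling: {co2_after(t):.0f} ppmv")
```

Under these assumptions the doubling is reached after a little under 70 years, which is the point at which the transient sensitivity is read off.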

The amount of CO2 (in ppmv) in the atmosphere is an input to the modelling software, right?

Also of interest are inter-model comparisons, which you did not discuss.
dave says

13 Jan 2005 at 12:34 AM

I hadn’t seen the Tobias post (#12) when I wrote my post (#13).

Michael says

However, it is true enough that the more controls we have over these models, the more we can fudge the results. There is, therefore, a danger that we can add too many “knobs”, and ultimately end up studying the model rather than nature.

OK, let’s discuss this for a moment. I have been a software guy for many years. As I said, a GCM is a theory of climate implemented in software run against a data set. If you tweak the model, you have to run regression tests to see if it still gives a good fit with your data in all cases. I don’t care how many controls you add, if it runs and gives a good fit with all the data sets (paleo, historical, recent), then it is a viable theory of climate. That’s it. The climate system is complex. There are no doubt many, many inputs. And outputs are “emergent”, as noted.

Now, if you’re fudging things just to get the “Hockey Stick” (but missing the 8.2ky event, the “Little Ice Age” or the PETM), then that’s just wrong and not scientific. But if you’re getting some reasonable consistency across the board after running the regression tests, then you’ve got a working theory of climate.

[Response: The statement ‘to get the “Hockey Stick” (but missing…the “Little Ice Age”)’ is wrongly premised. Refer to this post (and references therein) for more discussion. The “Hockey Stick”, or perhaps better, the “Hockey Team” (that is, the multiple independent reconstructions indicating essentially the same pattern of hemispheric mean temperature variation in past centuries), shows only moderate cooling (a few tenths of a degree) during the so-called “Little Ice Age”. At the hemispheric-mean scale, the “Little Ice Age” is only a moderate cooling because larger offsetting regional patterns of temperature change (both warm and cold) tend to cancel in a hemispheric or global mean. Modelers of course do not compare just hemispheric mean series, but the actual spatial patterns of estimated and observed climate changes in past centuries. See e.g. the review paper: Schmidt, G.A., D.T. Shindell, R.L. Miller, M.E. Mann, and D. Rind 2004. General circulation modelling of Holocene climate variability. Quaternary Sci. Rev., 23, 2167-2181, doi:10.1016. In this paper, the model used is shown to match the spatial pattern of reconstructed temperature change during the “Little Ice Age” (which includes substantial regional cooling in regions such as Europe) as well as the smaller hemispheric-mean changes (the “Hockey Team” if you will) -mike]

You can’t know what a good theory will look like in advance, obviously. It may be very messy (and I’ll bet it is).

So, if you’ve done the appropriate regression testing of the model and everything seems to be working out, then you’ve got a working theory. On the other hand, if the model is overly messy (ad-hoc), then you’re probably missing something. This is a judgement call but it is the way science works.

If you’ve got a working theory – fits reasonably for all data sets – no matter how messy, then how is it the case that you are studying the model and not nature?

If I’m confused about something here, I’m sure you’ll let me know.
Tom Rees says

13 Jan 2005 at 5:24 AM

Vish (msg 4) points out that models can be tweaked to match observations, without necessarily being true. This is the case. However, they can’t be tweaked infinitely and yet still match observations, and as others have pointed out no-one has yet been able to tweak a GCM to show negligible warming with the predicted increase in greenhouse gases.

So the question is, what range of predictions can we get out of climate models by tweaking them in different ways so that they still match observations. This is the question that is answered by ensemble modelling. Interested folks should look up the publications available online at climateprediction.net, starting with Peter Stott & Jamie Kettleborough, Origins and estimates of uncertainty in predictions of twenty-first century temperature rise, Nature, 416, pp.719-723, 18 April 2002.
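
The idea of tweaking parameters while still matching observations can be sketched with a toy example. Everything here is hypothetical: the two-parameter "model", the forcing values, and the observed warming are made-up numbers chosen for illustration, and none of this is how climateprediction.net or any real GCM ensemble works. The sketch only shows the logic of keeping the parameter settings that reproduce the past and then looking at the spread of their future projections.

```python
def hindcast(sensitivity, aerosol_offset, ghg_forcing=2.0):
    """Toy past warming: sensitivity times net forcing (all values made up)."""
    return sensitivity * (ghg_forcing - aerosol_offset)


def projection(sensitivity, future_forcing=4.0):
    """Toy future warming for a hypothetical future forcing."""
    return sensitivity * future_forcing


def constrained_range(observed=0.7, tolerance=0.1):
    """Range of projections from parameter pairs that still match the past."""
    kept = []
    for i in range(1, 31):            # sensitivity 0.05 .. 1.50
        for j in range(0, 16):        # aerosol offset 0.0 .. 1.5
            s, a = i * 0.05, j * 0.1
            if abs(hindcast(s, a) - observed) <= tolerance:
                kept.append(projection(s))
    return min(kept), max(kept)


if __name__ == "__main__":
    low, high = constrained_range()
    print(f"projections consistent with the toy 'observations': {low:.1f} to {high:.1f}")
```

In this toy setting the two parameters can trade off against each other, so the past constrains the future only to a range; but no parameter pair that matches the past gives a projection near zero.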
John Finn says

13 Jan 2005 at 5:44 AM

Gavin

I agree with many of the other posters. This is the sort of article that this site should be about.

I just have one (not particularly relevant) question. You say:

In fact, the Hansen et al (1992) paper actually predicted the temperature impact of Pinatubo (around 0.5 deg C) prior to it being measured

I thought I read that the temp impact was 0.3 deg C (not 0.5) but that’s not important for the moment. My question is: over what time period was there an impact on temperatures due to Pinatubo? Do you mean that the global mean temperature for the whole of the following year (1992?) was lower – or is it part of one year and part of the next (e.g. 1992/93) – or was it just a few months? Sorry, I’ve phrased the question a bit clumsily – but I hope it makes sense.
Tom Rees says

13 Jan 2005 at 5:48 AM

Oops sorry – the one you should start with is Reto Knutti, Thomas Stocker, Fortunat Joos & Gian-Kasper Plattner, Constraints on radiative forcing and future climate change from observations and climate model ensembles, Nature 416, 18 April 2002.
John Finn says

13 Jan 2005 at 10:36 AM

Post #16 did contain a question which I’ll repeat below.

RE: Mt Pinatubo eruption

I thought I read that the temp impact was 0.3 deg C (not 0.5) but that’s not important for the moment. My question is: over what time period was there an impact on temperatures due to Pinatubo? Was the global mean temperature for the whole of the following year (1992?) lower by 0.5 (or 0.3) – or is it part of one year and part of the next (e.g. 1992/93) – or was it just a few months? Sorry, I’ve phrased the question a bit clumsily – but I hope it makes sense.

thanks

[Response: The global cooling peaked at around 0.5 deg C in the second half of 1992, and temperatures returned to pre-Pinatubo levels by the end of 1994. See http://www.giss.nasa.gov/data/update/gistemp/GLB.Ts.txt for the actual data. -gavin]
Lee A. Arnold says

13 Jan 2005 at 2:20 PM

This is a really terrific encapsulation of the issues. I think it is over the heads of Myron Ebell’s intended audience, however, and it may be necessary to write an overview–at high-school level–that explains the relative value of inductive and deductive methods in science, use of multi-compartment models in general, the sorts of problems they regularly entail (formulation, measurement, n-body calculation, brute-force computer simulation, experimental repetition, real-world validation, emergent properties, catastrophic regime-shift, assignment of probabilities, etc.) and how these are variously or provisionally overcome, according to the science you are practicing.

It’s a tall order, but there are different ways in. For example, it is interesting that the “sound science” anti-climate model people are almost always willing to grant validation to the economic free-market model, which has exactly the same problems (if not more, since there we are also dealing with qualitative changes in information and motives).

Despite your stated intention to keep this website to strictly scientific concerns, and avoid the political and economic, you are going to be inevitably drawn into the wider discussion precisely through the public’s lack of understanding of the philosophy and methodology of science (in, for example, several of the comments above). Perhaps some of these complaints could be headed-off more efficiently by referring to some general “backgrounders.” Ecosystems science was rather replete with them, around thirty years ago.
Arthur Smith says

13 Jan 2005 at 2:29 PM

Physics has the same issues – I commented on this over at sciscoop a few weeks back. We’re always making approximations, and much of theoretical physics these days consists of models that don’t bear a lot of relation to reality, but have a mathematical form that we are confident matches reality in some respect. As long as your free parameters are fewer than the number of variables in what you’re matching to (and climate scientists are matching to long sequences of global and regional climate behavior) a model is a perfectly normal way of doing science. This fuss about climate modeling just amazes me.
Dano says

13 Jan 2005 at 4:13 PM

Re comment #20: [t]his fuss about climate modeling just amazes me – if there is no fuss raised, the Myron Ebells and the anti-modeling people in Lee’s excellent post [19] have no dog in the fight.

The intent is to sow doubt, not foster an environment for discussion.

And IMHO the last paragraph in 19 is absolutely necessary. A large fraction of the Murrican public cannot distinguish between science debate and policy debate [hence one of the reasons for this blog].

Best,

D
Tim says

13 Jan 2005 at 5:34 PM

The fuss is because it has direct political and economic implications and in no way reflects the scientific grounding of the matter.
John Dodds says

13 Jan 2005 at 7:39 PM

I have no problem with modelling as a tool of science. I have problems with the interpretations. For example, the temp curves of natural, anthropogenic & “all” given above show that the model is closer with all.

How do we know that it was the anthropogenics (commonly referred to as CO2 & the subject of the political Kyoto decision) that resulted in the closer estimation and not some competing/compensating errors in the natural model that do not show up until the 1970-2000 etc temp rises? (ie competing poor modeling of other factors?)

Put another way, I have problems with the idea that CO2 “causes” most/all of the recent warming, but even then we still seem to get drops in the temperature in spite of the increasing CO2. If we get temp drops every few years WHAT is causing them and apparently overshadowing the rises caused by the CO2 & associated feedbacks? Do we know that those factors out of the hundreds in the model are not poorly modelled? How?

[Response: The possibility of competing model errors accidentally producing the right result is a concern. This is lessened by similar results being produced by different GCMs, and by the “fit” being quite good over the different continents separately, rather than just in the global mean. As for the variability of the temperature, this is natural variability, and is expected, and is reproduced by the models. Indeed, the ability to reproduce natural variability is one of the tests the models are expected to pass – William]
Eli Rabett says

13 Jan 2005 at 7:43 PM

Many here seem to think that GCMs behave like neural nets. Neural nets, which have been used for climate modeling, are the ultimate in parameter tweaking. Actually they are nothing but parameter tweaking. They are very good at telling you what will happen if things stay pretty much on the same course. Unfortunately they give you about zero insight into what is causing things to happen, and they are useless at a forking point.

OTOH, GCMs are constrained by the built-in physics and chemistry. This is the lesson of ensemble forecasting, that if you keep any remaining parameterization within physically reasonable limits the results are strongly bounded.
Pat Neuman says

13 Jan 2005 at 10:48 PM

The last paragraph of Gavin’s “Is Climate Modelling Science?” posted on 12 Jan 2005 is repeated below:

[So, in summary, the model results are compared to data, and if there is a mismatch, both the data and the models are re-examined. Sometimes the models can be improved, sometimes the data was mis-interpreted. Every time this happens and we get improved matches between them, we have a little more confidence in their projections for the future, and we go out and look for better tests. That is in fact pretty close to the textbook definition of science.]

From reading the paragraph, I conclude that prediction of climate using climate models is similar to prediction of river flood levels using hydrologic models. Operational hydrologic modeling requires precipitation, temperatures and other meteorological data for input. Calibration of hydrologic models is limited by the length of the historical river data. Accurate prediction of river stages and flows is accomplished by hydrologists that understand how the models work and what the limitations are likely to be. The hydrologist uses the latest operational data to update the model states to best represent true hydrologic conditions within each river basin. There are likely to be timing and volume errors in the modeling and predictions. With carefully made evaluations and adjustments of the latest observations, the model is honed to give the best prediction. It’s better not to make the model unnecessarily complex, otherwise adjusting model states for observations becomes cumbersome with a possible loss in predictive capability. Data and model states must be scrutinized, and the hydrologist needs to consider possible changes in the river basin characteristics through time, which add complexity to calibration, modeling and prediction.
Toby Kelsey says

13 Jan 2005 at 10:57 PM

The use of “ensemble forecasting” (#15 and #23) presupposes that the number of tweakable parameters significantly exceeds that required for fitting the model. This contradicts Arthur’s belief (#20) in an excess of data over parameters. It would be useful to know which of these positions is correct for recent climate models.

The use of ensemble forecasting “strongly binds” the results/predictions (#23) only if you are confident you have captured all the correct relationships and functional forms and there are no additional significant relationships unaccounted for. Different model structures respond to overfitting differently.

Speaking of unaccounted relationships; are the effects of “global dimming” still controversial, or have they been incorporated into the standard climate models?

Whatever the validity of the currently accepted model (and my previous questions remain unanswered) one way to improve the rigour of the science would be to adopt an explicitly Bayesian approach, including priors for the model parameters and a transparent explanation of which data was used in fitting which component at each stage of building and adjusting the model.

I don’t wish to imply that climate scientists have not adequately considered these issues; just that clearer explanations of these points would help those of us outside the community understand the accuracy and limitations of the models better.

Unless I misunderstand him, I disagree with dave (#14) that the number of controls (parameters) doesn’t matter. Any model-building scientist should know about overfitting and understand the relationship of the number of parameters to the useability of a model.

[Response: The standard use of ensembles in transient climate simulations is because of the underlying chaotic dynamics of the system. This is an attempt to average over the ‘weather’ to understand the climate. ‘Ensembles’ in which different parameter choices are made in order to understand potential systematic biases are also undertaken (see for instance http://www.climateprediction.net ). The number of truly free tunable parameters in these models is actually quite small. – gavin]
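
The first kind of ensemble mentioned in the response, averaging over the "weather" to see the climate, can be illustrated with a toy sketch. The assumptions are loud ones: each "run" here is just a made-up linear trend plus independent Gaussian noise drawn with a different random seed, which stands in for the chaotic variability of a real simulation and is not how a GCM generates it. The function names are hypothetical.

```python
import random
import statistics


def toy_run(seed, years=100, trend_per_year=0.01, noise_sd=0.2):
    """One hypothetical 'simulation': a linear trend plus random 'weather'."""
    rng = random.Random(seed)
    return [trend_per_year * y + rng.gauss(0.0, noise_sd) for y in range(years)]


def ensemble_mean(runs):
    """Average the runs year by year."""
    return [statistics.fmean(values) for values in zip(*runs)]


def rms_error(series, trend_per_year=0.01):
    """Root-mean-square departure of a series from the underlying trend."""
    errs = [(v - trend_per_year * y) ** 2 for y, v in enumerate(series)]
    return (sum(errs) / len(errs)) ** 0.5


if __name__ == "__main__":
    runs = [toy_run(seed) for seed in range(20)]
    print("single run RMS error:   ", round(rms_error(runs[0]), 3))
    print("ensemble mean RMS error:", round(rms_error(ensemble_mean(runs)), 3))
```

With independent noise, the ensemble mean departs from the underlying trend much less than any single run does, which is the sense in which averaging members isolates the forced signal.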
Randolph Fritz says

14 Jan 2005 at 1:08 AM

By the way, the National Science Foundation’s Paleoclimate Program and NASA’s Earth Science Directorate have released an educational global climate modeler called EdGCM. You too can find out why working with physical models is real difficult & real science.

(Via Worldchanging.)
John Finn says

14 Jan 2005 at 6:13 AM

Response: The global cooling peaked at around 0.5 deg C in the second half of 1992, and temperatures returned to pre-Pinatubo levels by the end of 1994. See http://www.giss.nasa.gov/data/update/gistemp/GLB.Ts.txt for the actual data. -gavin

Thanks for this. This is more or less what I was after. I just wondered what annual global temps would have been without Pinatubo and whether 1992 (or 1993) would have been the ‘warmest year’ rather than 1998. A difference of 0.5 over a complete calendar year would obviously have that effect – but it looks as though this wasn’t the case.
CharlieT says

14 Jan 2005 at 11:38 AM

Yes, there is a problem with GCM and other models being described as scientific. In themselves they are practically impossible to invalidate; it is only ever a subset of their parameter settings or sub-modules that can be shown to be false.

It seems really hard to identify a point at which Popper might be able to say this GCM is false. Could http://www.ncdc.noaa.gov/oa/ncdc_vtt_pwt.ppt be such a point, why not?

To make matters more difficult, even a simple mechanistic model (AFRC Wheat) containing good science in its sub-modules struggles if fed less than perfect data: http://www.nottingham.ac.uk/environmental-modelling/Roger%20Payne.pdf
Given the imperfect nature of real-world data, could AFRC Wheat ever be shown to be wrong?

Models do seem to be accorded a supra scientific status that allows them to make incorrect predictions and yet never die from falsification
[Like horoscopes?]

I am not attacking modelling efforts; indeed I think that they are very important. I’m commenting on the “modelling isn’t science” thought.

[Response: You are making the mistake of thinking that a model is just one thing: right or wrong. This is a completely false dichotomy. The models are instead an approximation to the real world that has been shown to be reasonable over a wide range of conditions. When there is a data-model mismatch (as discussed above) we go back to the models and the data to see if there is something wrong, or some process that we have not accounted for. Sometimes these get resolved, and at any one time, there are a number of such mismatches outstanding. For instance, atmospheric models do not generally produce realistic Quasi-Biennial Oscillations in the tropical lower stratosphere. Since this is clearly a real-world phenomenon, the models have been shown to be ‘false’. But unlike in your Popperian world, that doesn’t mean we throw the whole thing out. We instead examine the physics of the QBO in the real world and try and understand why the models don’t show this. It is most probably related to the vertical resolution and the lack of small scale gravity waves (but this is not my field, so I could be wrong). So people are working on how these effects are included in the model and are trying to make that more realistic. The problem with the Popperian view is that it does not take into account how complex present-day science is. Things are no longer as simple as ‘is the sun or the Earth the centre of the solar system’. Climate science is an evolving field, not a static one. – gavin]
CharlieT says

16 Jan 2005 at 5:05 PM

OK,
Are there some simple projections of the MSU lower troposphere temperature anomalies that we could watch come (mostly) true over the next year or two?

(I say MSU as those of us with a doubting background tend to like it)

I am not sure what sort of detail you would be comfortable with, but for interest’s sake monthly Northern and Southern Hemisphere would be good. But if it had to be annual, perhaps divide the world into eight or so equal segments?

If something similar exists already – could you post a link?

{Slightly off topic – If the global-dimming thing has some truth in it, then it could be part of the economic activity temperature story. Yet this might imply that even if M&M were completely right, the world could still be in for a rough time ahead?
—A very stretched thought, I know}

[Response: Year to year variability is chaotic, and long term trends are small compared to this whether you look at MSU or the surface records. The only things worth watching for in the near term are potential improvements to the intercalibration problems with the MSU satellites that might influence the calculation of those long term trends. -gavin]
John Finn says

17 Jan 2005 at 1:07 PM

Gavin

Regarding the 3 graphs which are included in your article: I notice from the top graph that the models match (or more than match) the actual measured warming between roughly 1910 and 1945. What dominant forcing factor do the models assume is responsible for this?

I notice from the link you gave me a bit back, i.e.

http://www.giss.nasa.gov/data/simodel/F_line.gif

that there is no net increase in forcing between 1850 and 1920 and yet a fairly noticeable rise in temperatures is already under way.

[Response: Between 1850 and 1920? Not really. It was colder in the 1890s, and the increase to the 1920s is due principally to the reduction of volcanic forcing, combined with a little bit of GHG and solar (and balanced by a small aerosol increase as well). Changes in forcings due to anthropogenic effects really don’t start to become dominant until later. -gavin]
JF says

17 Jan 2005 at 5:18 PM

Between 1850 and 1920? Not really. It was colder in the 1890s, and the increase to the 1920s is due principally to the reduction of volcanic forcing, combined with a little bit of GHG and solar (and balanced by a small aerosol increase as well)

Sorry – my question was badly worded. I meant a fairly noticeable rise from about 1910 to 1945 – but I don’t see a comparative increase in the forcings. A slight increase from about 1925 perhaps – but this would seem to be attributable to GHG forcing which presumably would not be fully effective until about 1945 – when climate actually started cooling.
Bruce Frykman says

17 Jan 2005 at 7:52 PM

>”As William noted in his response, your definition of science would clearly exclude large areas of things that I certainly consider “science,” and that I think you would be uncomfortable excluding.”

>”I assume you’d not want to exclude seismology as not being “science”.

Aren’t most seismologists employed by oil companies?
I would suggest that they can indeed “predict” to a degree that makes their profession worthwhile, even among those who possess the ability to pass on the value of the profession.

>”In fact, geology has long functioned in significant measure as a descriptive rather than predictive science.”

Again, the usefulness of this profession has to do with their ability to predict future outcomes; geologists are employed in the mining and civil engineering fields, not for their edifying prose, but for their demonstrated usefulness in predicting satisfactory outcomes for some future human endeavor.

>”Models play a huge role. To follow your line of argument, you’d have to throw out vast areas of earth science.”

I believe predictive models are most useful when the things they model are understood to a level where the model demonstrably re-creates an underlying reality.

>”Likewise anthropology and archaeology, which do not attempt to predict future outcomes.”

Yes, but these fields are essentially history; they might invoke some tools of science to help lend credibility to the history (carbon-14 dating etc.) but they do not represent the essence of science which makes it different from history (or religion); that is, the intrinsic disprovability of all of its tenets. All descriptions of the past, while possibly very credible, are not disprovable.

>”And ecology. And cosmology. All of these people are doing science, it seems to me. If they’re not doing science, they’re doing something else that’s awfully useful to our understanding of the world around us.”

I’m not sure they are. I am quite sure they are familiar with the tools of science but that’s not the same as “doing science”. Cosmology might be an intellectual pursuit but then so is religion. Cosmology, like religion, fails the disprovability test. I doubt that mankind has the sensory apparatus, the intellect, the temporal perspective, or the spatial perspective to properly contemplate the nature of the universe in its fullness. We do have the aforementioned qualities in sufficient quantity to solve some real problems provided we retain the ability to disprove any of our errant thought processes.

>”Simply defining them away as “unscientific” doesn’t make them any less useful.”

Apart from what I would call “talk over the wine”, I fail to see the usefulness of cosmology.
John Finn says

19 Jan 2005 at 11:30 AM

Gavin (or anybody else)

Can I suggest a short article on Forcing due to a doubling of CO2 and Climate Sensitivity. I’d be very interested in an explanation on the following

How is the forcing of 4 W/m2 (for a doubling of CO2) calculated? If it’s from alpha*ln(2), how is the alpha constant determined?

You say, elsewhere on this site, that climate sensitivity is 0.75 deg C per W/m2 – leading to a 3 (4 x 0.75) deg C increase in global temperatures. You also say that this 0.75 is justified because of ice age temperatures (-5 to -6 deg lower) and forcings (about -7 W/m2). I’d like to know

a) What forcings contributed to the lower temperatures?
b) What contribution did the reduction of Water Vapour make?
c) What was the earth’s albedo during the LGM – and how does that compare to now?
d) Are there any modern day observations which provide ‘validation’ for a sensitivity of 0.75 deg?
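
The arithmetic behind the figures quoted in this comment can be checked with a short sketch. It only back-computes numbers from the values given above (4 W/m2 per doubling, 0.75 deg C per W/m2, and an ice-age change of 5 to 6 deg C for about 7 W/m2); it assumes the logarithmic form alpha*ln(C/C0) mentioned in the question and says nothing about how alpha is derived physically. The function and constant names are made up for the example.

```python
import math

FORCING_2XCO2 = 4.0   # W/m2 for doubled CO2, the figure quoted in the comment
SENSITIVITY = 0.75    # deg C per W/m2, as quoted in the comment


def alpha_from_doubling(forcing_2x):
    """Coefficient implied by F = alpha * ln(C / C0) when C / C0 = 2."""
    return forcing_2x / math.log(2.0)


def warming(forcing, sensitivity=SENSITIVITY):
    """Equilibrium warming (deg C) for a given forcing (W/m2)."""
    return sensitivity * forcing


def implied_sensitivity(temperature_change, forcing):
    """Sensitivity (deg C per W/m2) implied by a past temperature change."""
    return temperature_change / forcing


if __name__ == "__main__":
    print(f"alpha implied by 4 W/m2: {alpha_from_doubling(FORCING_2XCO2):.2f} W/m2")
    print(f"warming for doubled CO2: {warming(FORCING_2XCO2):.1f} deg C")
    for dt in (-5.0, -6.0):
        print(f"ice age, {dt} deg C: {implied_sensitivity(dt, -7.0):.2f} deg C per W/m2")
```

On these quoted numbers the ice-age range brackets the 0.75 figure (roughly 0.71 to 0.86 deg C per W/m2), which is the consistency the comment is asking about.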

Thanks
Eli Rabett says

23 Jan 2005 at 2:03 AM

I can’t resist Bruce Frykman’s comment. As someone who has moved in his scientific career, from physics to chemistry, with the occasional dabble in biology, I notice huge differences in approach. When I started out, biology was completely descriptive, chemists didn’t believe that theory was useful and physicists were limited to what they could figure out with pencil and paper. Atmospheric science, such as it was, was barely quantitative, if that. It was only in the 80s that it was possible to move beyond one dimensional models. Today, even biology has a useful theoretical component where one can make predictions based on more than gut instinct.

I would suggest that many of us are carrying around outmoded models of scientific possibilities.
Lars says

24 Jan 2005 at 6:27 PM

Something else that he’s left out, Eli, is the ability of a rigorous historical science to test postdictions (for want of a better word) or corollaries of its theories. Palaeontologists test theories of evolutionary relationships by predicting the nature of intermediate forms that might (with luck) be found – a recent example is the feathered “dino-bird” from China, which corresponded quite nicely with Heilbrunner’s predicted precursor for Archaeopteryx from 1912 or so. Plate tectonics yields testable predictions about how fast and in what directions the continents are moving, which can and have been measured today. Cosmologists tested the Big Bang model by predicting a distribution of the universe’s cosmic background temperature, testable by the recent mapping of its proxy, the universe’s background microwave intensity.
Shannon Love says

24 Jan 2005 at 7:11 PM

“So, in summary, the model results are compared to data, and if there is a mismatch, both the data and the models are re-examined. Sometimes the models can be improved, sometimes the data was mis-interpreted.”

This is a recipe for self-delusion.

Comparing models against existing historical data is the only means of verifying a model’s future predictive value. You are on solid ground there. However, changing the data that the model is tested against destroys the validity of the test. One can be locked into a feedback loop wherein the model is altered to fit the data while the data is being altered to fit the model.

The data against which the model is to be tested should be agreed upon and locked in before the model is run. If the model does not reproduce the historical data then it fails, period.

Data should never be re-examined on the basis that it fails to correspond to an artificial model. Comparing our models against universally accessible measurements of nature is what separates science from other intellectual endeavors.

[Response: I’m not sure what planet you happen to be studying, but things are rarely as simple as you appear to believe. Problems with ‘data’ abound; due to spatially and temporally inhomogeneous networks, changing systematic biases, incorrect interpretations of proxy data, inappropriate interpolations, calibration drifts, etc. It is not ‘self-delusional’ when confronted with a discrepancy to look for reasons why the comparison might not be appropriate even though the model and data appear to be giving the same quantity. I did not state that in every case the model was correct, and the data wrong, but it has been known! – gavin]
Shannon Love says

25 Jan 2005 at 12:51 AM

gavin,

Data changes over time but the data used to test each iteration of a model must be fixed beforehand. Any changes to the data should arise purely from observations entirely unrelated to the model. Otherwise, how can you ever be sure that the changes to the data weren’t unconsciously made to make the data conform to the model?

There are many instances of this happening in the sciences. We would want to guard against that here in climatology where we will use the models to decide whether millions live or die.

[Response: Remember though that it is not the modellers who go around changing the data. There are plenty of observationalists who are intrigued by apparent model-data mismatches who cheerfully leap into the fray and examine any possible sources of problems in the data. The MSU issue is a classic example. There are at least three different groups of remote sensing specialists who have examined the MSU data and found problems with an incorrect correction of orbital drift, calibration issues with one of the satellites, sea ice related problems in the high latitudes etc. – gavin]
John Dodds says

26 Jan 2005 at 3:52 PM

Somewhere in the references I noticed a graph of the various forcings for the models, which shows (I think) an ever increasing GHG/CO2 forcing, and a much smaller oscillating solar forcing (sunspots perhaps?)
Question 1: Is there a forcing factor for orbital variations (Milankovitch factors)? Are these modelled or ignored as trivial?
Question 2: Are any other solar forcings except sunspots modelled?

[Response: the Milankovitch timescale is long and the forcing barely varies due to orbital changes over 100 years so no, they aren’t included (they would be for people modelling the last glacial maximum); solar forcing is modelled by change in total solar irradiance (probably as a total number; not sure if changes at different wavelengths are included) – William]
John Dodds says

26 Jan 2005 at 6:29 PM

Timing and averaging in the models for a typical 150 yr run:
The models give results as global average temps for a year etc. (see charts way above) What are the time increments that the models calculate/increment on? Is it a yearly average? or is it a shorter time?

Are annual variations in forcings modelled? or just a yearly average number- eg CO2 in Hawaii goes up an average of 2-3ppm/yr, but it varies by ~5ppm throughout the year. What is modelled?

[Response: Some of the answers are available in the glossary definition of GCM. But briefly, the model’s time step is usually around 20 or 30 minutes, so the annual numbers are averages over ~20,000 loops of the physics. Some forcings have important seasonal cycles (stratospheric ozone for instance) and that seasonality is included. The seasonality of CO2 is not very important radiatively, and currently is not included in the GISS model runs (but it may be at other institutes). – gavin]
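As a rough check on the ~20,000 figure in the response above, the number of physics time steps per model year can be worked out directly. This is a minimal sketch, assuming a fixed 365-day model year; the function name is illustrative and not taken from any model code.

```python
MINUTES_PER_DAY = 24 * 60
DAYS_PER_MODEL_YEAR = 365  # assumption: fixed 365-day year, no leap years


def steps_per_year(timestep_minutes):
    """Number of time steps in one model year for a step length given in minutes."""
    return DAYS_PER_MODEL_YEAR * MINUTES_PER_DAY // timestep_minutes


# Compare the two step lengths mentioned in the response.
for dt in (20, 30):
    print(dt, "minute step:", steps_per_year(dt), "steps per year")
```

A 30-minute step gives 17,520 steps per year and a 20-minute step gives 26,280, which brackets the quoted figure of roughly 20,000.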
Dan Hughes says

27 Jan 2005 at 10:06 AM

Are the results reported in this article: http://news.bbc.co.uk/1/hi/sci/tech/4210629.stm, to be considered part of the science of climate modelling? They seem to have been rushed to the press in a newspaper before peer-review.

The prediction by climate modelling scientists of an 11 C rise seems to be well outside of the range previously reported.

Generally, how do the investigations over at http://climateprediction.net fit into the overall scheme of the science of climate modelling?

Thanks

[Response: We have a commentary on this up. It is not as bad as reported! – gavin]
Bruce Frykman says

27 Jan 2005 at 10:54 AM

Highly respected peer reviewed “Nature” has published new findings that the latest climate models now predict “catastrophic” climate change which will put London under the sea and raise what’s left of England’s average temperature by 20 degrees C (36F).

This will make England’s summers hotter than Death Valley.

What can we do to avoid this “catastrophe”? Surely nothing we have ever faced, even nuclear warfare, is this serious. Are there any sacrifices that are too dear to avoid the armageddon of climate change now proved beyond any reasonable doubt by the latest peer reviewed model?

And this all goes without saying that newer models “could” even predict the coming catastrophe to be 4 times worse than we now think.

[Response: See above – gavin]
Dlwer says

27 Jan 2005 at 11:39 AM

The results mentioned in #41 were not rushed to the press but rather published in Nature. Here is a link to the article on climateprediction.net. Quite a bit of fun to read the scientific article and then have a look how the popular press like BBC reports it ;)
Robin says

29 Jan 2005 at 9:58 AM

I’m intrigued by this board. All those discussions about models and tweaking. For me it looks all nice and smooth. But the main concern I have is that models and simulations of the global climate are used as facts instead of being used as guidelines for further research. We use models (local climate) for medium range prediction for local weather and the best (spot-on) results are within 4 or 5 days. After that the chaotic behavior of the atmosphere will be too much out of the framework of the parameters that form the base of the simulation. In my opinion, our understanding of the climate as a whole and the sources driving the climate is less than required to form a decent working model. Although a model can be tweaked to represent the observations, that is no guarantee that the model will better represent real life. Therefore I think the models for the climate prediction we have now are as good as me telling the world that I’m sure by my model it will be 9.3 degrees Celsius with 1.2mm rain and a 4.12 Beaufort wind coming from 87.6 degrees from magnetic north at 08h23m17s PM on January 6th 2012. There is still a lot to do gentlemen..
John Dodds says

2 Feb 2005 at 6:14 PM

Re #39 and 40, I am unclear as to what is modelled and what is ignored as trivial when modelling forcings & feedbacks.
For example:
Seasonal CO2 is NOT modelled at GISS. (Just curious, is it location dependent, i.e. north or south of the equator or over sea or land, during the year? I assume that the winds even it out fairly quickly to allow the use of average values for forcing.)
Milankovitch long cycles (413Kyrs to 20Kyrs) are not modelled for a typical 1850 to 2000 run.
Solar variations are typically(?) modelled as 10 year cycles with a magnitude of 0.4W/m2 (ref 2002 Hansen & Sato etc where I was directed by the GSM models glossary entry.)
For the 150 year model what is modelled for short term orbital fluctuations: eg daily rotation, monthly moon gravity effects on precession etc, yearly elliptical orbit effects for an assumed fixed elliptical orbit, yearly tilt and precession impacts etc?

Thanks for all the references to the model descriptions/validations etc above & in 39 & 40. Very useful for understanding the entire process.

John Dodds

[Response: Solar has a ~11 year component but also a long term trend (Lean, 2002), but obviously that is more uncertain. No short term orbital fluctuations are included (we don’t even have leap years). – gavin]
John Dodds says

9 Feb 2005 at 4:19 PM

Would you please correct any misperceptions that I have in the following paragraphs:

Based on the papers identified above and in the Glossary under GCM, and on the responses to questions 23, 39, 40 and 45 above, it seems to me that all the GISS GCM etc models for the 1850-2050 timeframe are based on the following assumptions:
1. Incoming solar energy is determined from an ~11 year Solar Irradiance cycle, which varies by less than a couple of Watts/m2 from the yearly mean.
2. There is no modelling of any orbital variations in incoming energy, either daily, yearly or long term Milankovitch variations, based on the assumption that a global yearly average value has a net zero change over the year which is imposed on the energy forcing at the TOA and the QFlux boundary etc. Year over year changes would be accounted for in the solar irradiance.
3. Variations in the CO2 forcing function (& presumably all the GHGs) are also based on a yearly global average value, so that there is also no daily or seasonal variation included in the models, let alone north-south variations. (Apparently the southern hemisphere CO2 air concentration actually follows the northern one by dropping several ppm in Nov-Apr (southern summer), rather than the seasonally imposed one in the north where it drops ~5ppm in the winter due to no photosynthesis.)

[Response: There is also a long term trend component in the solar irradiance based on Lean et al (2000). Milankovitch variability is ignored because it is extremely small over the time period 1850 to 2000. There is a hemispheric gradient in CH4 and N2O, but not in CO2 (which has similar annual means in each hemisphere). – gavin]

Now the question that bothers me:
IF all the energy & GHG input forcing functions are yearly average values, then how does the model accurately account for time & temperature dependent feedback and physical processes? Since many of these processes result in non-symmetric time, location and temperature dependent feedbacks (eg water vapor, clouds, CO2 washout, condensation, ice formation, radiative and convective heat transfer etc) then how can a model that uses yearly average values for the forcings accurately reflect the results?
By averaging, doesn’t the model over/underestimate differing feedback and carryover effects from day to day, season to season, year to year, century to century? Surely they all don’t average out to no yearly change to the earth?
eg1 Every night the solar energy input, and reflected flux, which accounts for the daily 20 degree temp rise, goes to zero, but the earth keeps radiating energy based on its temperature.
eg2 Similarly there are seasonal temperature changes such that in winter the average temp is ~10-20 degrees lower than the mean, and likewise the northern winter CO2 air concentration is 5ppm lower, thus changing the energy flow balance in winter vs summer and day vs night & north vs south.

Thanks,

[Response: These feedbacks are indeed modelled because they depend not on the trace greenhouse gas amounts, but on the variation of seasonal incoming solar radiation and effects like snow cover, water vapour amounts, clouds and the diurnal cycle. The model has seasons and day and night very similar to those in the real world. Only the impact of seasonal variations in CO2 (say) is not included – but that is a very small (tiny) part of the seasonal or diurnal variability. – gavin]
John Dodds says

9 Feb 2005 at 7:27 PM

I’m confused.
In 45 you said that you do not model any short term orbital effects.
In 46 you said that you have day/night and seasons like the real world.
How do you get day/night/seasons without modelling the orbital effects of earth rotation and travel round the assumed constant elliptical orbit?

Followup- are these day/night/seasonal variations forced on the mean solar irradiance so that the daily temperature in the model actually fluctuates 20 degrees or so? Or is it only done on every 5th time step when the energy subroutine is called?
Is this day/night/seasonality then a “forcing” since it is outside the earth physical model? (you really need to do your proposed discussion of forcing vs feedback!!)
and then what is the uncertainty? (since HansenSato 2002 does NOT give an uncertainty factor for orbital factors, just a 1951-2000 uncertainty of zero for solar irradiance)

Thanks

[Response: I presumed you meant short term variations in the orbital parameters (which we don’t include). If you really meant just the effect of the orbit (i.e. seasonal cycles, aphelion, perihelion etc.) all of those things are included. The solar zenith angle controls how much solar radiation impacts the top of the atmosphere and that is calculated every 30 minutes. So the models do have a comparable diurnal temperature range to the real world. The day/night/seasonality is not a forcing in itself, but it would be if it changed (i.e. because of Milankovitch). The orbital parameters are known extremely precisely, so there isn’t an uncertainty in their values. The solar irradiance forcing is given as 0.4 +/- 0.2W/m2 since 1850 (Fig 18, panel 1, Hansen et al, 2002 – note that the zero was not an uncertainty in panel 2, it was just what was put in that version of the model – i.e. they did not change solar then). Hope that’s clearer now. – gavin]
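The dependence of top-of-atmosphere sunlight on the solar zenith angle, described in the response above, can be sketched in a few lines. This is only an illustration, not the GISS model's radiation code: it assumes common textbook approximations for solar declination and Earth-Sun distance, a nominal solar constant, and hypothetical function names.

```python
import math

S0 = 1361.0  # W/m2; assumed nominal solar constant, for illustration only


def declination(day_of_year):
    """Approximate solar declination in radians (textbook sine approximation)."""
    return math.radians(23.45) * math.sin(2 * math.pi * (284 + day_of_year) / 365)


def distance_factor(day_of_year):
    """Approximate (mean distance / actual distance)^2 for a fixed elliptical orbit."""
    return 1 + 0.033 * math.cos(2 * math.pi * day_of_year / 365)


def cos_zenith(lat_deg, day_of_year, solar_hour):
    """Cosine of the solar zenith angle at a latitude, day, and local solar time."""
    lat = math.radians(lat_deg)
    dec = declination(day_of_year)
    hour_angle = math.radians(15.0 * (solar_hour - 12.0))  # 15 degrees per hour
    return (math.sin(lat) * math.sin(dec)
            + math.cos(lat) * math.cos(dec) * math.cos(hour_angle))


def toa_insolation(lat_deg, day_of_year, solar_hour):
    """Incoming solar flux at the top of the atmosphere in W/m2 (zero at night)."""
    mu = max(0.0, cos_zenith(lat_deg, day_of_year, solar_hour))
    return S0 * distance_factor(day_of_year) * mu


# Recompute every 30 minutes over one day at 45N near the June solstice.
for step in range(48):
    hour = step * 0.5
    print(f"{hour:5.1f} h  {toa_insolation(45.0, 172, hour):8.1f} W/m2")
```

In this sketch the day/night cycle comes from the hour angle, the seasons from the declination, and the aphelion/perihelion effect from the distance factor. The orbital parameters themselves (tilt, eccentricity) are held fixed, which matches the distinction drawn in the response between the effect of the orbit and changes to the orbit.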
John Dodds says

11 Feb 2005 at 1:39 PM

Three questions:
1. What did you mean by the comment (#45) that the models do not even have leap years? Surely you account for ~365.25 days per year, otherwise the incoming solar flux would be way underestimated, or if there are only 365.0 days in a year, after 150 years you are missing ~35 days and northern spring would start in February instead of March (which would be MAJOR global warming.)

2. Why is solar irradiance a forcing function, but earth rotation, & travel thru the orbit (ie changing earth sun distance, changing solar zenith angle, etc) is NOT a forcing function? They both result in variations in the incoming solar radiation to the modelled earth ecosystem.

3. Since Milankovitch factors are excluded as small, BUT they do exist and by ignoring them you are introducing an increasing underestimation of the incoming solar radiation (& its impact on solar irradiance and on water vapor etc feedbacks), then why is there not an uncertainty estimate for this or better yet an actual estimate of what the underestimation is?
(eg the eccentricity HAS been decreasing for the last 20,000 years, hence resulting in us being closer to the sun, hence warmer every year. – which by the way is an argument for why the Ruddiman hypothesis for an “expected” ice age is not valid- we should be “expecting” a 40,000 year warm period similar to what was recently discovered at Vostok for the time ~400,000 years ago when we were last at this point in the eccentricity cycle!)

Appreciate the education.

[Response: 1. The model planet only takes 365 days to go around the sun. Not 365.25, therefore no leap years. This is only a very slight error. 2. A forcing is defined as something that is changing over the long term. So only if the orbital parameters (i.e. the eccentricity or obliquity) change do you define it as a forcing. Changes to solar irradiance over annual and longer time scales are included as a forcing, but shorter term variability (related to the 27 day solar rotation for instance) is not.
3. You only make a significant error in ignoring Milankovitch on the thousand year and longer timescale. The changes are otherwise just too small. – gavin]

Trackbacks

The Midpoint » Make it stop. says:

7 Oct 2007 at 10:26 PM

[…] But wait a minute! I thought computer models didn’t work when it came to predicting the effect of mankind on global warming! Are you saying they’re okay now? Okay, it’s good to know. As long as the message is consistent. (Of course, I’m just kidding here.) […]
Curiousities of the global warming debate at Speedkill says:

15 Apr 2008 at 11:25 PM

[…] future scenarios as they move into the past. If they don’t match, the models are changed. Real Climate explains this quite well (and they’re actually climate scientists, so that […]
