“These results are quite strange”, my colleague told me. He had analysed some of the recent climate model results from an experiment known by the cryptic name ‘CMIP5’. It turned out that the results were fine, but we had made an error when reading and processing the model output. The particular climate model that initially seemed to give strange results had used a different calendar set-up from the previous models we had examined.
In fact, the models used to compute the CMIP5 results use several different calendars: the Gregorian calendar, an idealised 360-day calendar, or a calendar with no leap years. These differences do not really affect the model results themselves; however, they are important to take into account in any further analysis.
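As an illustration, here is a minimal sketch in Python (assuming the cftime library, which is one common way of handling model calendars, and made-up numbers) showing how the same numeric time value decodes to different dates under different calendars:

```python
import cftime

# The same numeric time coordinate decodes to different dates
# depending on the calendar the model was run with:
t = 719                              # e.g. days since the start of a simulation
units = "days since 2000-01-01"

for calendar in ("standard", "360_day", "noleap"):
    print(calendar, cftime.num2date(t, units, calendar=calendar))
```

Reading a 360-day-calendar file as if it were Gregorian shifts the dates progressively through the year, which is exactly the kind of error we stumbled on.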
Just to make things more complicated, model results and data often come with different time units (such as counting hours since 0001-01-01 00:00:0.0) and physical units (precipitation in m/day or kg m⁻² s⁻¹; temperature in Kelvin, Fahrenheit, or Centigrade). Different countries use different decimal delimiters: point or comma. And missing values are sometimes represented as a blank space, some unrealistic number (e.g. -999), or ‘NA’ (not available) if the data is provided as ASCII files. No recorded rainfall is often represented by either 0 or the ASCII character ‘.’.
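Once the units are known, the conversions themselves are trivial; the danger lies in not noticing them. A hedged sketch with numpy (the values are made up):

```python
import numpy as np

# Precipitation flux in CMIP files is typically given in kg m-2 s-1;
# 1 kg of water over 1 m^2 is a depth of 1 mm, so multiplying by the
# number of seconds per day converts the flux to mm/day.
pr_flux = np.array([2.5e-5, 0.0, 1.2e-4])   # kg m-2 s-1 (made-up values)
pr_mm_day = pr_flux * 86400.0               # mm/day

# Temperature: Kelvin to degrees Celsius.
tas_K = np.array([271.3, 288.1, 301.4])     # made-up values
tas_C = tas_K - 273.15
print(pr_mm_day, tas_C)
```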
For station data, the numbers are often ordered differently, either as rows or as columns, and with a few lines at the beginning (the header) containing a varying amount of description. There are almost as many ways to store data as there are groups providing data. Great!
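To illustrate, a sketch of reading one such ASCII file with Python's pandas; the file name, header length, and column name here are all hypothetical, and each portal would need its own variant:

```python
import pandas as pd

# One hypothetical portal's layout: a free-text header, whitespace-separated
# columns, comma as the decimal delimiter, and -999 or 'NA' for missing values:
df = pd.read_csv(
    "station_precip.txt",      # hypothetical file name
    skiprows=12,               # header length varies from portal to portal
    sep=r"\s+",
    decimal=",",
    na_values=["-999", "NA"],
)
# Dry days recorded as '.' mean zero rainfall, not a missing value,
# so they need separate treatment ('precip' is a hypothetical column name):
df["precip"] = pd.to_numeric(df["precip"].replace(".", "0"))
```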
Murphy’s law, combined with this diversity of formats, implies that reading and testing data takes time: different scripts must be written for each data portal. Given appropriate tools, the time it takes to read data could in principle be reduced to seconds, and the risk of making mistakes eliminated. Some data portals provide code, such as Fortran programs, but using Fortran for data analysis is no longer very efficient.
We are not done with the formats, as there are more aspects to data, analyses, and model results. A proper and unambiguous description of the data is always needed so that people know exactly what they are looking at. I think this will become even more important with the new efforts devoted to the World Meteorological Organisation’s (WMO) Global Framework for Climate Services (GFCS).
Data description is known as ‘meta-data’: it tells us what a variable represents, its units, the location and time it applies to, the method used to record or compute it, and its quality.
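For files that carry proper meta-data, such a description can be read directly. A sketch using the netCDF4 Python library, assuming a hypothetical file whose variables carry the usual descriptive attributes:

```python
from netCDF4 import Dataset

# Reading the meta-data of a hypothetical model output file:
with Dataset("tas_model_output.nc") as nc:
    tas = nc.variables["tas"]
    print(tas.long_name)              # what the variable represents
    print(tas.units)                  # the physical units
    time = nc.variables["time"]
    print(time.units, time.calendar)  # the time axis and its calendar
    print(nc.ncattrs())               # global attributes (method, history, ...)
```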
It is important to distinguish measurements from model results. The quality of data is given by error bars, whereas the reliability of model results can be described by various skill scores, depending on their nature.
There is a large range of possibilities for describing methods and skill scores, and my guess is that there is no less diversity than we see in data formats used in different portals. This diversity is also found in empirical-statistical downscaling.
A new challenge is that the volume of climate model results has grown almost explosively. How do we make sense of all these results and all the data? If the results come with proper meta-data, it may be possible to apply further statistical analysis to sort and categorise them, identify links (regression), or apply geo-statistics.
Meta-data with a controlled vocabulary can help keep track of results and avoid ambiguities. It is also easier to design common analytical and visualisation methods for data which have a standard format. There are already some tools for visualisation and analysis, such as Ferret and GrADS; however, these are mainly for gridded data.
Standardised meta-data also allows easy comparisons between the same type of results from different research communities, or between different types of results, e.g. by means of experimental design (Thorarinsdottir et al., 2014). Such statistical analysis may make it possible to say whether certain choices lead to different results, provided the results are tagged with the different schemes employed in the models. This type of analysis makes use of certain keywords, based on a set of commonly agreed terms.
Similar terms, however, may mean different things to different communities; examples include ‘model’, ‘prediction’, ‘non-stationarity’, ‘validation’, and ‘skill’. I have seen how misinterpretation of such concepts has led to confusion, particularly among people who don’t think that climate change is a problem.
There have been recent efforts to establish controlled vocabularies, e.g. through EUPORIAS and a project called downscaling metadata, and a new breed of concepts has entered climate research, such as COG and CIM.
There are further coordinated initiatives addressing standards for meta-data, data formats, and controlled vocabularies. Perhaps most notable are the Earth System Grid Federation (ESGF), the Coupled Model Intercomparison Project (CMIP), and the Coordinated Regional Climate Downscaling Experiment (CORDEX). The data format used by climate models, netCDF with the ‘CF’ conventions, is a good start, at least for model results on longitude-latitude grids. However, these initiatives don’t yet offer explanations of validation methods, skill scores, or modelling details.
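For instance, a minimal sketch of writing CF-style meta-data with the netCDF4 Python library (the file name and values are made up); the point is that the calendar, the units, and a standard_name from the CF controlled vocabulary travel with the numbers:

```python
import numpy as np
from netCDF4 import Dataset

# A small file where the meta-data accompanies the data itself:
with Dataset("pr_example.nc", "w") as nc:
    nc.createDimension("time", 3)
    time = nc.createVariable("time", "f8", ("time",))
    time.units = "days since 2000-01-01 00:00:00"
    time.calendar = "360_day"                  # the calendar is stated explicitly
    time[:] = [0.0, 1.0, 2.0]

    pr = nc.createVariable("pr", "f4", ("time",))
    pr.standard_name = "precipitation_flux"    # from the CF controlled vocabulary
    pr.units = "kg m-2 s-1"
    pr[:] = np.array([2.5e-5, 0.0, 1.2e-4])
```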
Validation, definitions, and meta-data have been discussed in a research project called ‘SPECS’ (which explores the possibilities for seasonal-to-decadal prediction), because it is important to understand the implications and limitations of its forecasts. There is also another project, called VALUE, that addresses the question of validation strategies and skill scores for downscaling methods.
Many climate models have undergone thorough evaluation, but this is not apparent unless one reads chapter 9 on model evaluation in the latest IPCC report (AR5). Even in this report, a systematic summary of the different evaluation schemes and skill scores is sometimes lacking, with the exception of a summary of the spatial correlation between model results and analyses.
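For concreteness, here is a sketch of such a spatial (pattern) correlation score in Python with numpy, weighting each grid box by the cosine of its latitude; the fields are random stand-ins for real model and analysis data:

```python
import numpy as np

def pattern_correlation(model, analysis, lat):
    """Area-weighted spatial correlation between two lon-lat fields."""
    w = np.cos(np.radians(lat))[:, None] * np.ones_like(model)
    m = model - np.average(model, weights=w)
    a = analysis - np.average(analysis, weights=w)
    cov = np.average(m * a, weights=w)
    return cov / np.sqrt(np.average(m**2, weights=w) *
                         np.average(a**2, weights=w))

# Random stand-ins on a coarse 10-degree grid (19 latitudes x 36 longitudes):
rng = np.random.default_rng(0)
lat = np.linspace(-90, 90, 19)
model = rng.random((19, 36))
analysis = rng.random((19, 36))
print(pattern_correlation(model, analysis, lat))
```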
The information about model skill would be more readily accessible if the results were tagged with the type of tests used to verify them, together with the test results (skill scores). An extra bonus is that a common practice of including a quality stamp describing validation may enhance the visibility of the evaluation aspect. To make such labelling effective, the labels should use well-defined terms and glossaries.
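A sketch of what such a quality stamp could look like, written as global attributes of a netCDF file with the netCDF4 Python library; the attribute names and values here are illustrative, not an agreed standard:

```python
from netCDF4 import Dataset

# Tag a hypothetical results file with how it was validated and how it scored:
with Dataset("downscaled_tas.nc", "a") as nc:
    nc.validation_method = "cross-validation against independent station data"
    nc.skill_score_type = "spatial pattern correlation"
    nc.skill_score = 0.82   # made-up number
```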
There is more to climate information than gridded results from a regional climate model. What about quantities such as return values, probabilities, storm tracks, the number of freezing events, intense rainfall events, the start of a rainy season, the wet-day frequency, extremes, or droughts? Society at large needs information in a range of different formats, provided by climate services. Statistical analysis and empirical-statistical downscaling can provide information in untraditional ways, as well as improved quantification of uncertainty (Katz et al., 2013).
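As one example, the wet-day frequency is easy to compute once the daily data have been read correctly; a sketch in Python with a made-up daily series and a 1 mm/day threshold:

```python
import numpy as np

def wet_day_frequency(precip_mm_day, threshold=1.0):
    """Fraction of days with precipitation at or above the threshold (mm/day)."""
    return float(np.mean(np.asarray(precip_mm_day) >= threshold))

# A made-up daily series: roughly 40% wet days with exponential intensities.
rng = np.random.default_rng(1)
daily = rng.exponential(4.0, size=365) * (rng.random(365) < 0.4)
print(wet_day_frequency(daily))
```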
Another important piece of information is the process history, which makes the results traceable and in principle replicable. The history is important both for the science community and for use in climate services.
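The CF conventions already include a global ‘history’ attribute for this purpose. A sketch of appending a processing step to it with the netCDF4 Python library (the file and script names are hypothetical):

```python
from datetime import datetime, timezone
from netCDF4 import Dataset

# Append a time-stamped entry to the global 'history' attribute of a
# hypothetical results file, so each processing step leaves a trace:
with Dataset("downscaled_tas.nc", "a") as nc:
    stamp = datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    entry = f"{stamp}: bias-adjusted with example_script.py"  # hypothetical step
    nc.history = entry + "\n" + getattr(nc, "history", "")
```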
In summary, there has been much progress on climate data formats and standards, but I think we can go even further and become even more efficient by extending this work.
Update: see also the related post Climate Informatics: Human Experts and the End-to-End System
- T. Thorarinsdottir, J. Sillmann, and R. Benestad, "Studying Statistical Methodology in Climate Research", Eos, Transactions American Geophysical Union, vol. 95, p. 129, 2014. http://dx.doi.org/10.1002/2014EO150008
- R.W. Katz, P.F. Craigmile, P. Guttorp, M. Haran, B. Sansó, and M.L. Stein, "Uncertainty analysis in climate change assessments", Nature Climate Change, vol. 3, pp. 769-771, 2013. http://dx.doi.org/10.1038/nclimate1980