Over recent decades climate scientists have assembled a number of large datasets to quantify how climate has changed, and is changing, at the largest spatial and temporal scales, and to understand the causes of those changes. It has become clear that these datasets need to be expanded and made more transparent to address future needs. It is increasingly important to understand climate change at regional and even smaller scales. There is particular interest in extreme events, for which daily or even sub-daily data are needed to supplement existing data products, which are mostly available only at monthly time scales.
In February 2010 the international community* endorsed a proposal from the UK Met Office to hold a workshop that would start a process leading to the construction of new datasets, initially for land surface air temperatures. This workshop took place in September. Participants included statisticians with environmental and economic expertise, climate scientists from every continent, and specialists from other disciplines such as metrology and software engineering.
The project is now being carried forward by a number of working groups, with statistical representation on several of these. The basic framework of the project is shown in the figure.

Many Stage 0 data sources are currently unavailable for analysis, either because they have not been digitised or for political reasons. Efforts will be made to make such data available (Stage 1) and to convert them to a common format (Stage 2). For digitising data, the idea of crowdsourcing, as has been used in astronomy, looks promising. The data in common format will be merged with existing holdings to form a common universal databank (Stage 3). This databank, supplemented by so-called metadata and artificial benchmark datasets, will be freely available.
Raw data can seldom be taken at face value. In addition to random recording errors, the vast majority of records have experienced systematic changes at some time in their histories. Changes can be abrupt, such as a change of site or instrument, or gradual, such as urbanisation or vegetation growth. The metadata will include information about such changes, when available. Current practice is to apply various adjustments, known as homogenisation, to account for known and unknown changes. Interpolation across data-void regions is also frequently carried out. This project will allow institutions and individuals to access the databank and metadata to create quality-controlled (Stage 4) and, finally, homogenised and possibly interpolated (Stage 5) datasets that can then be used to address the climate change questions of interest.
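To make the idea of homogenisation concrete, the following is a minimal sketch in Python, not any method adopted by the project. It assumes a single abrupt shift in an otherwise stationary series of monthly anomalies, locates the split that maximises the difference in segment means, and adjusts the earlier segment to match the later one. The function names, the simulated data and the choice of a 12-month minimum segment are all illustrative assumptions; operational algorithms are considerably more elaborate, typically comparing neighbouring stations and applying formal significance tests.

```python
import random
import statistics

def find_step_change(series, min_segment=12):
    """Return (index, shift) for the split that maximises the absolute
    difference between the mean after and the mean before the split.
    Hypothetical helper: no significance test is applied."""
    best_index, best_shift = None, 0.0
    for k in range(min_segment, len(series) - min_segment + 1):
        shift = statistics.fmean(series[k:]) - statistics.fmean(series[:k])
        if abs(shift) > abs(best_shift):
            best_index, best_shift = k, shift
    return best_index, best_shift

def adjust_for_step(series, index, shift):
    """Align the earlier segment with the later one by adding the shift."""
    return [x + shift if i < index else x for i, x in enumerate(series)]

# Simulated (not real) monthly anomalies in degrees C: random noise plus
# a 1.0 degree jump at month 60, mimicking a change of site or instrument.
rng = random.Random(0)
raw = [rng.gauss(0.0, 0.2) + (1.0 if t >= 60 else 0.0) for t in range(120)]

index, shift = find_step_change(raw)
homogenised = adjust_for_step(raw, index, shift)
print(f"estimated break at month {index}, estimated shift {shift:.2f}")
```

Adjusting the earlier segment, rather than the later one, keeps the series consistent with the most recent observing conditions; that convention is itself a design choice and is assumed here for illustration. Gradual changes such as urbanisation would not be captured by this step-change model.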
In the past, homogenisation has not always been transparent. For this project, any institution or individual wishing to have a homogenised (Stage 5) dataset included will need to document their quality control and homogenisation algorithms fully and have them objectively tested and assessed. A suite of artificial benchmark datasets that replicate the structure of real climate datasets, with changes artificially introduced to mimic those most likely to occur in practice, will be created by a third party. Dataset creators will be required to apply their algorithms to the benchmark datasets, and the results will be recorded. It is hoped that a broad range of independently derived datasets will be created, with consistent assessments of their strengths and uncertainties.
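The benchmarking idea can be illustrated with a small Python sketch, offered under stated assumptions rather than as the project's actual procedure. Synthetic series are generated with a step change of known position and size, a stand-in detection algorithm is run on each, and the proportion of breaks located within a tolerance is recorded. Every name, parameter value and the scoring rule below is hypothetical; real benchmarks would need realistic seasonal, spatial and autocorrelation structure.

```python
import random
import statistics

def make_benchmark(n_months, break_at, shift, noise_sd, seed):
    """Create a synthetic anomaly series containing one known,
    artificially inserted step change (illustrative only)."""
    rng = random.Random(seed)
    return [rng.gauss(0.0, noise_sd) + (shift if t >= break_at else 0.0)
            for t in range(n_months)]

def candidate_algorithm(series, min_segment=12):
    """Stand-in for a submitted algorithm: return the split index that
    maximises the absolute difference in segment means."""
    best_index, best_gap = None, 0.0
    for k in range(min_segment, len(series) - min_segment + 1):
        gap = abs(statistics.fmean(series[k:]) - statistics.fmean(series[:k]))
        if gap > best_gap:
            best_index, best_gap = k, gap
    return best_index

def score(algorithm, cases, tolerance=6):
    """Fraction of benchmark cases whose break is located to within
    `tolerance` months of the true, inserted position."""
    hits = 0
    for case in cases:
        estimate = algorithm(make_benchmark(**case))
        if estimate is not None and abs(estimate - case["break_at"]) <= tolerance:
            hits += 1
    return hits / len(cases)

# Hypothetical benchmark cases; the true break positions are known to the
# assessor but would be withheld from the dataset creator.
cases = [
    {"n_months": 120, "break_at": 40, "shift": 1.0, "noise_sd": 0.2, "seed": 1},
    {"n_months": 120, "break_at": 60, "shift": -0.8, "noise_sd": 0.2, "seed": 2},
    {"n_months": 120, "break_at": 85, "shift": 1.2, "noise_sd": 0.2, "seed": 3},
]
hit_rate = score(candidate_algorithm, cases)
print(f"breaks located within tolerance: {hit_rate:.0%}")
```

Because the inserted changes are known exactly, different algorithms can be compared on the same footing, which is the property that makes benchmark results a consistent basis for assessing strengths and uncertainties.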
Statisticians clearly have a key role to play in the creation of datasets, interpolation, benchmarking and the assessment of uncertainty, as well as in the overall direction of the project. This is an international exercise with a long-term perspective, and it will evolve with time. Stage 0 data sources that include variables other than surface temperatures will be digitised in full, with future expansion of the project in mind. There are likely to be many opportunities for statisticians to become involved.
Further details of the project can be found at www.surfacetemperatures.org/