Taking climate emulators to the next level

MIT CS3-affiliated Aerospace Computational Engineering PhD student Chris Womack is working to advance more efficient, accurate, and reliable climate emulators. (Source: Earthday.org)

Research Insight: Advancing more efficient, accurate and reliable climate emulators

A perspective by MIT CS3 PhD student Chris Womack

This Research Insight explores recent efforts to take climate emulators to the next level.

Science

Used for decades to assess the likely impacts of anthropogenic climate change on our world, full-scale Earth-system models provide the most accurate physical representation of the climate system based on our current knowledge. But these models are extremely computationally expensive, taking weeks to months to run on supercomputing clusters. That is not feasible when we consider that there are potentially thousands of plausible future scenarios to explore. The idea behind climate emulators is to create the most lightweight model possible that can accurately and reliably reproduce the outputs of those big models at a fraction of the computational cost. So how can we make the best emulator possible, that is, the most efficient, accurate and reliable system for giving decision-makers the power to analyze how a potential policy might affect, say, precipitation or temperature 50 to 100 years from now?
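As a rough illustration of the emulation idea described above, the sketch below fits a cheap least-squares line to a handful of runs of a stand-in "full model" and then queries the fit instead of the model. Everything here is an assumption made for illustration: `toy_full_model`, its coefficients, and the forcing values are hypothetical and do not come from any real Earth-system model.

```python
def toy_full_model(forcing_wm2):
    """Hypothetical stand-in for an expensive model run.

    Maps a radiative forcing (W/m^2) to a warming (K). The mild quadratic
    term is an invented nonlinearity that the linear emulator cannot capture.
    """
    return 0.8 * forcing_wm2 + 0.02 * forcing_wm2 ** 2


def fit_linear_emulator(xs, ys):
    """Ordinary least-squares line through the training pairs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return lambda x: slope * x + intercept


# "Expensive" runs are made once, on a small set of training scenarios.
train_forcings = [1.0, 2.0, 3.0, 4.0, 5.0]
train_warming = [toy_full_model(f) for f in train_forcings]
emulator = fit_linear_emulator(train_forcings, train_warming)

# Inside the training range the emulator tracks the toy model closely;
# outside it, the error grows.
interp_error = abs(emulator(3.5) - toy_full_model(3.5))
extrap_error = abs(emulator(8.0) - toy_full_model(8.0))
print(f"error at 3.5 W/m^2: {interp_error:.3f} K")
print(f"error at 8.0 W/m^2: {extrap_error:.3f} K")
```

The gap between the two errors hints at why the choice of training scenarios matters: an emulator is only as reliable as the range of conditions it was trained on.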

Computational Tools

My colleagues at MIT and I have approached this problem in three ways. First, we have incorporated domain expertise, namely knowledge of Earth-system physics, into climate emulators, in addition to the standard data (e.g., surface temperatures based on emissions levels) generated by full-scale Earth-system models. In our paper “Rapid emulation of spatially resolved temperature response to effective radiative forcing,” we used thermodynamic relationships to make these emulators more realistic and thereby improve their performance. Likewise, in “A theoretical framework to understand sources of error in Earth-system model emulation,” we incorporated physical principles to improve climate emulator projections. The second aspect of our approach has been to improve the machine-learned or AI architectures behind these techniques. A study led by CS3 affiliates used generative AI to produce a new type of emulator that allows us to very quickly sample the uncertainty in future climates, improving our capability to predict not only what will happen on average, but also at the extremes.
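To make the first aspect more concrete, here is a minimal sketch of one common physically motivated emulator form: a linear response in which temperature is the convolution of a forcing time series with an impulse-response kernel. It assumes a one-box energy-balance model with yearly time steps; the `feedback` and `heat_capacity` values are hypothetical placeholders, and this is not presented as the method used in the papers cited above.

```python
import math


def impulse_response(n_years, feedback=1.2, heat_capacity=8.0):
    """Yearly kernel of a one-box energy-balance model (assumed form).

    feedback:      climate feedback parameter in W/m^2/K (hypothetical value)
    heat_capacity: effective heat capacity in W yr/m^2/K (hypothetical value)
    The kernel is the exact end-of-year response to forcing held constant
    over a single year, so its terms sum to 1 / feedback.
    """
    a = feedback / heat_capacity
    gain = (1.0 - math.exp(-a)) / feedback
    return [gain * math.exp(-a * k) for k in range(n_years)]


def emulate_temperature(forcing, kernel):
    """Discrete convolution: T[t] = sum over s <= t of kernel[t - s] * F[s]."""
    return [
        sum(kernel[t - s] * forcing[s] for s in range(t + 1))
        for t in range(len(forcing))
    ]


# Constant forcing of 3.7 W/m^2 held for 200 years (illustrative scenario).
forcing = [3.7] * 200
kernel = impulse_response(len(forcing))
temperature = emulate_temperature(forcing, kernel)

# Under constant forcing the response should relax toward F / feedback.
equilibrium = 3.7 / 1.2
print(f"year 10:  {temperature[9]:.2f} K")
print(f"year 200: {temperature[-1]:.2f} K (equilibrium {equilibrium:.2f} K)")
```

Because the kernel encodes an energy-balance relationship rather than a purely statistical fit, the emulator responds sensibly to forcing pathways it has never seen, at least within the limits of the linear assumption.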

The third aspect of our approach is to improve the data that we use to train climate emulators. Currently, researchers and practitioners typically use a limited set of Shared Socioeconomic Pathways (SSPs), which represent different levels of global climate mitigation ambition and for which we have the most data available. What I found in the theoretical framework study is that these SSPs are not necessarily the best choice to train on. My current work aims to determine what that best choice is. To do that, I have developed a methodology to iteratively update a training dataset to produce the most skillful, reliable emulator possible. The datasets we’re producing contain more information than the SSPs and can therefore be used to train a more skillful emulator. We've shown that this works well for a simple climate model and for one of intermediate complexity: the MIT Earth System Model (MESM), which was developed by and is used extensively in CS3. The next step is to run these scenarios with a full-scale climate model to demonstrate that they work at that scale.
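The general idea of iteratively updating a training dataset can be sketched with a generic greedy loop: fit an emulator, find the candidate scenario where it performs worst, add that scenario to the training set, and repeat. This is only an illustrative stand-in, not the methodology described above. The toy model, the piecewise-linear emulator, and the candidate grid are all assumptions, and the sketch evaluates the toy model at every candidate, which would be unaffordable with a real full-scale model.

```python
from bisect import bisect_right


def toy_full_model(x):
    """Hypothetical stand-in for an expensive model (scalar in, scalar out)."""
    return 0.8 * x + 0.02 * x ** 2


def emulate(train, x):
    """Piecewise-linear interpolation through the training pairs.

    `train` maps scenario input -> model output; x must lie within the
    range spanned by the training inputs.
    """
    xs = sorted(train)
    i = min(max(bisect_right(xs, x), 1), len(xs) - 1)
    x0, x1 = xs[i - 1], xs[i]
    y0, y1 = train[x0], train[x1]
    return y0 + (y1 - y0) * (x - x0) / (x1 - x0)


def worst_candidate(train, candidates):
    """Return the candidate with the largest emulator error, and that error."""
    def error(c):
        return abs(emulate(train, c) - toy_full_model(c))
    worst = max(candidates, key=error)
    return worst, error(worst)


# Start from the two extreme scenarios so every candidate is interpolated.
train = {x: toy_full_model(x) for x in (0.0, 8.0)}
candidates = [0.5 * k for k in range(1, 16)]

history = []
for _ in range(3):
    worst, err = worst_candidate(train, candidates)
    history.append(err)
    train[worst] = toy_full_model(worst)  # one additional "expensive" run

final_error = worst_candidate(train, candidates)[1]
print("max error before each update:", [round(e, 3) for e in history])
print("max error after updates:", round(final_error, 3))
```

In this toy setting each added scenario is chosen where it is most informative, so a few well-placed runs reduce the worst-case error more than the same number of arbitrarily chosen ones would.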

Strategy

By enabling emulators that are more accurate and reliable than those we have today, we can upgrade the toolset used to inform decision-makers. If an emulator can reproduce all the outputs of a full-scale Earth-system model, then one can use it to explore the implications of many different, plausible scenarios without ever needing to access a full-scale model. That said, we're always going to need those full-scale models. We need them not only to generate data for the emulators but also to explore problems that a purely data-driven technique will never be able to solve. If we use emulators to run standard scenarios, we can retarget full-scale models to study uncertainties beyond the SSPs, such as the triggering of climate tipping points. This new generation of emulators is about freeing up computational resources so that the full-scale models can focus on what they are good at.