Student Dissertation or Thesis

On the theory and optimal design of emulators for climate impact assessment

Womack, C.B. (2026)
PhD Thesis, MIT Department of Aeronautics and Astronautics

Abstract / Summary:

Earth System Models (ESMs) are our most comprehensive tools for projecting future climate impacts across the land, ocean, and atmosphere, yet their extreme computational costs limit their ability to survey the vast space of potential emissions trajectories. Climate emulators, reduced-order models that reproduce the statistics of these full-scale models in a fraction of the time, are poised to fill this scenario-assessment gap. Despite the rapid uptake of emulators in many domains, motivated in part by the broader machine learning (ML) revolution, many questions around their theoretical underpinnings, physical consistency, and ultimate utility for areas like impact assessment remain.

In this thesis, we first address the lack of a comprehensive theoretical basis for emulators by developing a framework that enables fundamental methodological comparisons. This framework connects disparate emulation techniques via ideas from statistical mechanics and stochastic calculus, and we apply it to understand potential sources of emulator error, focusing on memory effects, hidden variables, system noise, and nonlinearities. We discuss optimal use cases for a number of emulation techniques in light of these potential sources of error, along with implications for ESMs based on our pedagogical model results. Based on these findings, we then address emulator physical consistency and extrapolative skill. While efforts to improve emulator generalizability typically focus on the design of more complex ML architectures, we show that the training data itself is a major bottleneck for predictive skill. We introduce a method to generate optimal training data by iteratively updating an initial emissions trajectory to maximize emulator skill, showcasing applications to simple and intermediate-complexity climate models. An emulator trained on just one or two of these optimized scenarios outperforms one trained on six standard ScenarioMIP pathways. We achieve higher predictive skill despite training on a smaller dataset, and find that our emulators successfully isolate the distinct physical behaviors of different climate forcing agents (e.g., greenhouse gases vs. aerosols) without training on single-forcing runs. 

To support these theoretical and methodological improvements, we conclude by applying a novel, generative AI climate emulator to capture compound climate hazards like wet-bulb temperature. By coupling the MIT Emissions Predictions and Policy Analysis model to this emulator, we rapidly generate realizations of spatially- and cross-correlated climate fields. We utilize this framework to assess local sensitivities to various emissions scenarios, including an early assessment of the projected ScenarioMIP-CMIP7 protocol. Furthermore, we demonstrate that temperature overshoot pathways result in substantially higher cumulative heat stress risks compared to stabilization pathways with similar end-of-century outcomes. The improvements presented herein democratize access to computational science and detailed climate projections, enabling the probabilistic assessment of compound climate hazards, which is essential for robust adaptation planning.

Citation:

Womack, C.B. (2026): On the theory and optimal design of emulators for climate impact assessment. PhD Thesis, MIT Department of Aeronautics and Astronautics.