This paper introduces an open-source mechanistic interpretability tool tailored for AI weather models, enabling analysis of internal latent representations via methods like cosine similarity and PCA. Applying this tool to GraphCast, the authors identify linear combinations of latent channels that correlate with meteorological phenomena like synoptic-scale waves and specific humidity. The tool facilitates understanding of how AI weather models generate predictions, addressing the black-box nature of these systems.
A new open-source tool probes how latent representations in AI weather models may encode interpretable meteorological features.
Artificial Intelligence (AI) weather models are improving rapidly, and their forecasts are already competitive with traditional Numerical Weather Prediction (NWP). To build confidence in this new methodology, it is critical that we understand how these predictions are generated. This is a major challenge because AI weather models remain largely black boxes. In other areas of Machine Learning (ML), mechanistic interpretability has emerged as a framework for understanding ML predictions by analysing the building blocks responsible for them. Here we present an open-source, highly adaptable tool that incorporates concepts from mechanistic interpretability. The tool organises internal latent representations from the model processor and supports initial analyses, including cosine similarity and Principal Component Analysis (PCA), enabling the user to identify directions in latent space potentially associated with meteorological features. Applying our tool to the graph neural network GraphCast, we present preliminary case studies for mid-latitude synoptic-scale waves and specific humidity. These demonstrate the tool's ability to identify linear combinations of latent channels that appear to correspond to interpretable features.
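To make the two analyses named in the abstract concrete, here is a minimal sketch in Python with NumPy. It is not the tool's API and it does not use GraphCast: the latent array is synthetic, and the names `cosine_similarity`, `pca`, `latents`, and `direction` are hypothetical, chosen for illustration only. The sketch assumes latents can be arranged as a matrix of shape (nodes, channels), plants a wave-like signal along one known direction in channel space, and checks that the leading principal component recovers that direction.

```python
import numpy as np


def cosine_similarity(a, b):
    """Cosine similarity between two 1-D vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))


def pca(latents, n_components):
    """PCA via SVD of the mean-centred data.

    latents: array of shape (n_nodes, n_channels).
    Returns (components, explained_variance_ratio); components has shape
    (n_components, n_channels) and each row is a unit vector, i.e. a
    linear combination of latent channels.
    """
    centred = latents - latents.mean(axis=0)
    _, singular_values, vt = np.linalg.svd(centred, full_matrices=False)
    variance = singular_values ** 2
    return vt[:n_components], variance[:n_components] / variance.sum()


# Synthetic stand-in for processor latents (assumption: not real model output).
rng = np.random.default_rng(0)
n_nodes, n_channels = 500, 16

# A hidden unit-norm direction in channel space carrying a wave-like signal.
direction = rng.normal(size=n_channels)
direction /= np.linalg.norm(direction)
signal = np.sin(np.linspace(0.0, 6.0 * np.pi, n_nodes))

# Latents = signal along the hidden direction plus small isotropic noise.
noise = 0.1 * rng.normal(size=(n_nodes, n_channels))
latents = 5.0 * np.outer(signal, direction) + noise

components, ratio = pca(latents, n_components=2)

# Sign of a principal component is arbitrary, so compare absolute values.
alignment = abs(cosine_similarity(components[0], direction))

# Projecting every node onto the first component gives a per-node score
# that can be compared with a physical field (here, the planted signal).
projection = (latents - latents.mean(axis=0)) @ components[0]
correlation = abs(np.corrcoef(projection, signal)[0, 1])

print(f"explained variance ratio of PC1: {ratio[0]:.3f}")
print(f"|cosine similarity| between PC1 and planted direction: {alignment:.3f}")
print(f"|correlation| between PC1 projection and planted signal: {correlation:.3f}")
```

In a real analysis the planted direction is unknown. One would instead compare the per-node projection against a meteorological field, such as specific humidity, and treat any agreement as suggestive rather than conclusive.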