The authors introduce AION-1, a family of large-scale multimodal foundation models for astronomy designed to integrate imaging, spectroscopic, and scalar data. AION-1 uses a two-stage architecture: modality-specific tokenization followed by transformer-based masked modeling of cross-modal token sequences. Pretrained on five large-scale astronomical surveys encompassing over 200 million observations, AION-1 demonstrates strong performance with a frozen encoder on downstream tasks such as galaxy property estimation, morphology classification, and spectral super-resolution.
A single foundation model, AION-1, now handles tasks ranging from galaxy morphology classification to spectral super-resolution across diverse astronomical datasets.
While foundation models have shown promise across a variety of fields, astronomy still lacks a unified framework for joint modeling across its highly diverse data modalities. In this paper, we present AION-1, a family of large-scale multimodal foundation models for astronomy. AION-1 integrates heterogeneous imaging, spectroscopic, and scalar data using a two-stage architecture: modality-specific tokenization followed by transformer-based masked modeling of cross-modal token sequences. The model is pretrained on five large-scale surveys: Legacy Survey, Hyper Suprime-Cam (HSC), Sloan Digital Sky Survey (SDSS), Dark Energy Spectroscopic Instrument (DESI), and Gaia. These span more than 200 million observations of stars, galaxies, and quasars. With a single frozen encoder, AION-1 achieves strong results on a broad suite of downstream tasks, including galaxy and stellar property estimation, galaxy morphology classification, similarity-based retrieval, galaxy image segmentation, and spectral super-resolution. We release AION-1 model variants ranging from 300M to 3.1B parameters. Beyond astronomy, AION-1 provides a scalable blueprint for multimodal scientific foundation models that can seamlessly integrate noisy, instrument-specific observations. All code, tokenizers, pretrained weights, and a lightweight evaluation suite are released under an open-source license.
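To make the two-stage design concrete, here is a minimal PyTorch sketch of the pretraining recipe the abstract describes: each modality is mapped to a token sequence by its own tokenizer, the sequences are concatenated into one cross-modal sequence, a random subset of tokens is masked, and a transformer encoder is trained to reconstruct the masked positions. All names (`ModalityTokenizer`, `MaskedMultimodalModel`), dimensions, and the continuous-token MSE objective are illustrative assumptions, not AION-1's released implementation, which may differ (for example, by using discrete tokens).

```python
import torch
import torch.nn as nn

class ModalityTokenizer(nn.Module):
    """Stand-in for a modality-specific tokenizer (hypothetical name):
    maps a raw input vector to a sequence of token embeddings and adds
    a learned modality embedding so the transformer can tell modalities apart."""
    def __init__(self, in_dim, n_tokens, d_model):
        super().__init__()
        self.proj = nn.Linear(in_dim, n_tokens * d_model)
        self.n_tokens, self.d_model = n_tokens, d_model
        self.modality_emb = nn.Parameter(torch.zeros(1, 1, d_model))

    def forward(self, x):  # x: (batch, in_dim)
        tok = self.proj(x).view(-1, self.n_tokens, self.d_model)
        return tok + self.modality_emb  # (batch, n_tokens, d_model)

class MaskedMultimodalModel(nn.Module):
    """Two-stage pipeline: (1) tokenize each modality and concatenate the
    token sequences; (2) mask a random subset of tokens and train a
    transformer encoder to reconstruct them. Dimensions are placeholders."""
    def __init__(self, d_model=256, mask_ratio=0.5):
        super().__init__()
        self.tokenizers = nn.ModuleDict({
            "image":    ModalityTokenizer(in_dim=1024, n_tokens=64, d_model=d_model),
            "spectrum": ModalityTokenizer(in_dim=4000, n_tokens=32, d_model=d_model),
            "scalars":  ModalityTokenizer(in_dim=8,    n_tokens=4,  d_model=d_model),
        })
        layer = nn.TransformerEncoderLayer(d_model, nhead=8, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=4)
        self.mask_token = nn.Parameter(torch.zeros(1, 1, d_model))
        self.head = nn.Linear(d_model, d_model)  # predicts the original token embedding
        self.mask_ratio = mask_ratio

    def forward(self, batch):
        # Stage 1: modality-specific tokenization, then cross-modal concatenation.
        tokens = torch.cat([self.tokenizers[m](x) for m, x in batch.items()], dim=1)
        target = tokens.detach()
        # Stage 2: replace a random subset of tokens with the learned mask token.
        mask = torch.rand(tokens.shape[:2], device=tokens.device) < self.mask_ratio
        tokens = torch.where(mask.unsqueeze(-1), self.mask_token.expand_as(tokens), tokens)
        pred = self.head(self.encoder(tokens))
        # Reconstruction loss is computed on the masked positions only.
        return nn.functional.mse_loss(pred[mask], target[mask])

model = MaskedMultimodalModel()
batch = {"image": torch.randn(2, 1024),
         "spectrum": torch.randn(2, 4000),
         "scalars": torch.randn(2, 8)}
loss = model(batch)
loss.backward()
```

After pretraining under this kind of objective, the encoder can be frozen and its token representations fed to lightweight task-specific heads, which matches how the abstract describes the downstream evaluations with a single frozen encoder.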