19 papers from MIT CSAIL on Training Efficiency & Optimization
Demystifying LLMs for the masses might be as simple as turning their mechanics into a game.
Neural networks can accurately predict polymer free energies, even when traditional methods like the Bennett Acceptance Ratio fail due to poor phase-space overlap.
Forget hand-engineered features: LLMs can automatically generate rubrics that transform raw text into powerful representations, outperforming even pre-trained clinical models on EHR tasks.
Forget fine-tuning: task-specific experts are already hiding in the neighborhood of pretrained weights, and you can find them with random sampling.
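A minimal sketch of the idea as stated, assuming a pretrained `torch.nn.Module` and a hypothetical `eval_task_score` callable (not from the paper): perturb the pretrained weights with small random noise and keep the sample that scores best on the target task.

```python
import copy
import torch

def sample_expert(model, eval_task_score, sigma=0.01, n_samples=32):
    """Search the neighborhood of pretrained weights by random sampling.

    `eval_task_score` is a hypothetical callable returning a task metric
    (higher is better); sigma and n_samples are illustrative choices.
    """
    best_model, best_score = model, eval_task_score(model)
    for _ in range(n_samples):
        candidate = copy.deepcopy(model)
        with torch.no_grad():
            for p in candidate.parameters():
                p.add_(sigma * torch.randn_like(p))  # small isotropic perturbation
        score = eval_task_score(candidate)
        if score > best_score:
            best_model, best_score = candidate, score
    return best_model, best_score
```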
Beat the state-of-the-art in radio signal separation by 122x using a transformer trained with a simple cross-entropy loss; the same architecture could extend to gravitational waves.
Forget more data: pre-training on just 164M tokens of synthetic data from Neural Cellular Automata can outperform pre-training on 1.6B tokens of natural language for downstream LLM tasks.
By dynamically adjusting contrastive learning temperatures based on data density, MM-TS achieves state-of-the-art results on multimodal long-tail datasets.
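The paper's exact density estimator and temperature schedule aren't given here; this is a minimal sketch of the general mechanism, where the per-sample `density` scores and the linear density-to-temperature mapping are assumptions: scale the InfoNCE temperature per anchor so head (dense) and tail (sparse) samples are weighted differently.

```python
import torch
import torch.nn.functional as F

def density_adaptive_info_nce(z_a, z_b, density, tau_min=0.05, tau_max=0.2):
    """InfoNCE with a per-sample temperature driven by data density.

    z_a, z_b: (N, D) paired embeddings from two modalities/views.
    density:  (N,) hypothetical density scores in [0, 1]; the mapping
              below (denser -> softer temperature) is an assumption.
    """
    z_a, z_b = F.normalize(z_a, dim=1), F.normalize(z_b, dim=1)
    tau = tau_min + (tau_max - tau_min) * density          # (N,)
    logits = (z_a @ z_b.t()) / tau.unsqueeze(1)            # row-wise temperature
    labels = torch.arange(z_a.size(0), device=z_a.device)
    return F.cross_entropy(logits, labels)
```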
GPTQ's quantization of LLMs is leaving performance on the table: WaterSIC closes the gap with an information-theoretically near-optimal approach that beats the state-of-the-art on Llama and Qwen.
Achieve state-of-the-art results in high-resolution video geometry estimation by disentangling global coherence and fine detail using a dual-stream transformer architecture.
Lattice QCD calculations just got a whole lot faster: normalizing flows slash variance by up to 60x in key observables.
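As a rough illustration only (the paper's flow architecture, masking scheme, and target observables are not reproduced here), this is a minimal affine coupling layer of the kind used in flow-based samplers for lattice field configurations; the flattened-field layout and checkerboard mask are assumptions.

```python
import torch
import torch.nn as nn

class AffineCoupling(nn.Module):
    """One affine coupling layer: updates half of a flattened lattice field
    conditioned on the frozen half, with a tractable log-Jacobian."""

    def __init__(self, dim, hidden=128):
        super().__init__()
        self.register_buffer("mask", torch.arange(dim) % 2 == 0)  # assumed mask
        self.net = nn.Sequential(
            nn.Linear(dim, hidden), nn.Tanh(), nn.Linear(hidden, 2 * dim)
        )

    def forward(self, phi):
        frozen = phi * self.mask
        s, t = self.net(frozen).chunk(2, dim=-1)
        s = torch.tanh(s) * (~self.mask)  # only transform the active half
        t = t * (~self.mask)
        phi_out = frozen + (~self.mask) * (phi * torch.exp(s) + t)
        log_det = s.sum(dim=-1)  # log|det J|, accumulated across layers
        return phi_out, log_det
```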
Nightly hospital planning is now possible on a laptop: this work distills slow, complex agent-based epidemic models into fast, trustworthy surrogate models using neural ODEs, achieving a 10,000x speedup.
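A hedged sketch of what such a surrogate can look like, assuming an aggregate S/I/R-style state (the paper's actual state variables and solver may differ): a small MLP vector field integrated with fixed-step Euler, trained to match trajectories sampled from the slow agent-based model.

```python
import torch
import torch.nn as nn

class EpidemicSurrogate(nn.Module):
    """Neural-ODE surrogate: an MLP vector field over aggregate epidemic
    state (e.g. S/I/R fractions), integrated with fixed-step Euler."""

    def __init__(self, state_dim=3, hidden=64):
        super().__init__()
        self.f = nn.Sequential(
            nn.Linear(state_dim, hidden), nn.Tanh(), nn.Linear(hidden, state_dim)
        )

    def forward(self, x0, n_steps, dt=0.1):
        xs, x = [x0], x0
        for _ in range(n_steps):
            x = x + dt * self.f(x)  # Euler step; swap in RK4 for accuracy
            xs.append(x)
        return torch.stack(xs, dim=1)  # (batch, n_steps + 1, state_dim)

# Training target: trajectories sampled from the agent-based model, fit
# with e.g. nn.MSELoss between surrogate and ABM state sequences.
```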
Forget gradient projections – NESS sidesteps catastrophic forgetting by directly exploiting the null space of previous tasks, identified via small singular values, to constrain weight updates.
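A minimal sketch of the null-space mechanism as described (the SVD threshold and the exact projection point in NESS are assumptions): collect previous-task layer inputs, take the right singular directions with small singular values, and project each gradient onto that subspace before the weight update so old outputs stay nearly unchanged.

```python
import torch

def null_space_projector(feats, tol=1e-2):
    """Projector onto the (approximate) null space of previous-task features.

    feats: (n_samples, d) layer inputs collected on previous tasks.
    Directions with singular values below `tol` (threshold is an assumption)
    span a subspace where updates barely perturb previous-task outputs.
    """
    _, S, Vh = torch.linalg.svd(feats, full_matrices=True)
    small = torch.zeros(Vh.size(0), dtype=torch.bool)
    small[: S.numel()] = S < tol
    small[S.numel():] = True              # directions not spanned at all
    V_null = Vh[small].t()                # (d, k) null-space basis
    return V_null @ V_null.t()            # (d, d) projection matrix

def project_grad(grad, P):
    # Constrain the update: move only within the previous tasks' null space.
    return grad @ P
```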
E(3)-equivariant networks just got a whole lot faster: a new algorithm cuts the complexity of Clebsch-Gordan Tensor Products from $O(L^6)$ to $O(L^4\log^2 L)$ without sacrificing completeness.
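For reference, the operation being accelerated is the standard Clebsch-Gordan tensor product between irrep features:

$$
(u \otimes_{\mathrm{CG}} v)^{(l)}_{m}
  = \sum_{m_1=-l_1}^{l_1} \sum_{m_2=-l_2}^{l_2}
    C^{(l,m)}_{(l_1,m_1),(l_2,m_2)}\, u^{(l_1)}_{m_1} v^{(l_2)}_{m_2},
\qquad |l_1 - l_2| \le l \le l_1 + l_2 .
$$

Evaluating this densely over all degrees $l_1, l_2, l \le L$ and all $m_1, m_2$ is what gives the naive $O(L^6)$ cost.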
Independently trained multimodal models like CLIP aren't so independent after all: a single orthogonal transformation can align their embedding spaces across both image and text modalities.
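One standard way to estimate such a single orthogonal map from paired embeddings is the orthogonal Procrustes solution; the sketch below assumes access to embeddings of shared inputs from both models (the paper's estimation procedure may differ).

```python
import torch

def orthogonal_alignment(X, Y):
    """Orthogonal Procrustes: the R minimizing ||X R - Y||_F with R^T R = I.

    X, Y: (N, D) embeddings of the same N inputs from two independently
    trained models (pairing by shared inputs is an assumption here).
    """
    U, _, Vh = torch.linalg.svd(X.t() @ Y)
    return U @ Vh  # (D, D) orthogonal map from X's space to Y's

# Per the finding above, one R estimated on image embeddings should also
# align the corresponding text embeddings, and vice versa.
```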
Neural routing solvers can now efficiently tackle hard constraints thanks to Construct-and-Refine (CaR), which slashes the refinement steps needed by 500x while boosting solution quality.
Ditch the geometry-to-property map: this work uses the external potential as the primary input for machine learning models, unlocking a scalable and equivariant approach to predicting electronic structure.
Find optimal DNN accelerator mappings in under a minute, a search that was previously intractable, and expose the suboptimality of prior mapping heuristics.
Achieve state-of-the-art video face enhancement with VividFace, a one-step diffusion model that drastically cuts inference time while boosting perceptual quality and temporal consistency.
Self-supervised learning beats supervised learning for ECG interpretation when labeled data is scarce, unlocking more robust and generalizable AI-driven cardiac diagnostics.