MIT CSAIL

MIT's Computer Science and Artificial Intelligence Laboratory. One of the largest and oldest AI labs in academia.

www.csail.mit.edu

Total papers

Total citations

470

Avg citations

Top Researchers

Antonio Torralba, Tommi Jaakkola, Daniela Rus

Recent Papers

Feb 12, 2026

Deakin University·2d ago·affiliated lab: MIT CSAIL

Beyond Code: Empirical Insights into How Team Dynamics Influence OSS Project Selection

This paper investigates the influence of team dynamics on OSS project selection by surveying 198 OSS practitioners. The study reveals that communication-related team dynamics like responsiveness and clarity are consistently prioritized, but the relative importance varies based on contributor motivations such as gaining reputation or networking. The findings demonstrate that aligning team dynamics with contributor motivations is crucial for understanding project selection behavior and designing better project recommendation systems.

Empirically demonstrates that team dynamics, particularly communication-related aspects, significantly influence OSS project selection, with the relative importance of specific dynamics varying based on contributor motivations.

Shashiwadana Nirmani, Hourieh Khalajzadeh, Mojtaba Shahin·2602.11692

Code Generation & Program Synthesis·Open-Source Models & Weights·Recommendation & Information Retrieval

Jan 23, 2026

3w ago

High-Rate Quantized Matrix Multiplication: Theory and Practice

This paper analyzes quantized matrix multiplication (MatMul) for efficient LLM deployment, considering both generic and weight-only quantization scenarios. It derives information-theoretic rate-distortion tradeoffs and benchmarks practical quantization schemes like absmax INT and floating-point against these limits, quantifying their rate loss. The authors then introduce "WaterSIC," a waterfilling-based quantization scheme for weight-only quantization that outperforms existing methods like GPTQ by adapting rate allocation to the covariance matrix, achieving near-optimal performance within 0.25 bits/entry of the information-theoretic limit.

Introduces WaterSIC, a novel waterfilling-based quantization scheme for weight-only matrix multiplication that achieves near-optimal rate-distortion performance and improves upon existing methods like GPTQ.

Or Ordentlich, Yury Polyanskiy·2601.17187

Architecture Design (Transformers, SSMs, MoE)·Inference & Quantization

Jan 22, 2026

3w ago·affiliated lab: MIT CSAIL

Foundation models for electrocardiogram interpretation: clinical implications.

This study establishes self-supervised learning (SSL) as a promising paradigm for ECG analysis, particularly in settings with limited annotated data, enhancing accessibility, generalizability, and fairness in AI-driven cardiac diagnostics across diverse clinical environments and questions.

A. Nolin-Lapalme, Achille Sowa, Jacques Delfrate +30

Jan 14, 2026

DreamWaQ++: Obstacle-Aware Quadrupedal Locomotion With Resilient Multimodal Reinforcement Learning

This paper introduces DreamWaQ++, a multimodal reinforcement learning framework that fuses proprioceptive and exteroceptive information for robust quadrupedal locomotion in complex environments. The approach trains a controller capable of agile navigation across challenging terrains like rough ground, steep slopes, and high stairs, while also exhibiting resilience to out-of-distribution scenarios. Key to the success is the fusion of proprioceptive feedback with exteroceptive data to enable obstacle avoidance and adaptive gait planning.

Introduces a resilient multimodal reinforcement learning framework, DreamWaQ++, that effectively fuses proprioception and exteroception for robust quadrupedal locomotion in challenging environments.

I. M. A. Nahrendra, Byeong-Uk Yu, Mi-Suk Oh +5

Robotics & Embodied AI·Multimodal Models·World Models & Planning

Dec 19, 2025

Dec 19, 2025·affiliated labs: MIT CSAIL, Mila, Tsinghua AI

OpenAI GPT-5 System Card

The paper introduces GPT-5, a unified system comprising a fast, general-purpose model and a deeper reasoning model, managed by a real-time router trained on user feedback and performance metrics. GPT-5 demonstrates improved performance on benchmarks, faster response times, and enhanced utility for real-world queries, with significant reductions in hallucinations, improved instruction following, and minimized sycophancy. The system incorporates "safe-completions" for safety and is treated as High capability in the Biological and Chemical domain under OpenAI's Preparedness Framework, triggering associated safeguards.

Introduces a unified GPT-5 system with a real-time router that dynamically selects between a fast, general-purpose model and a deeper reasoning model based on query characteristics, optimizing for speed and accuracy.

Aaditya K. Singh, Adam Fry, Adam Perelman +47962·2601.03267

Reasoning & Chain-of-Thought·Tool Use & Agents·Eval Frameworks & Benchmarks

Dec 7, 2025

Dec 7, 2025·affiliated labs: Stanford HAI, MIT CSAIL, Berkeley AI Research (BAIR), Tsinghua AI

International AI Safety Report 2025: Second Key Update: Technical Safeguards and Risk Management

The International AI Safety Report 2025's Second Key Update analyzes the current state of AI risk management and technical mitigations employed by researchers, companies, and governments. It highlights advancements in training safer models and monitoring outputs while acknowledging uncertainties in the effectiveness of these measures and their variability across applications. The report aims to inform policymakers, researchers, and the public about progress and remaining gaps in AI safety.

Synthesizes recent developments in AI risk management and technical risk mitigation strategies, identifying both progress and persistent gaps in ensuring the safety of general-purpose AI systems.

Y. Bengio, Stephen Clare, Carina Prunkl +34

Constitutional AI & AI Ethics·Red-Teaming & Adversarial Robustness·Eval Frameworks & Benchmarks

Oct 31, 2025

Oct 31, 2025·affiliated lab: MIT CSAIL

DLDC: A Dual Loop Data Cleaning Method for Fine-Tuning Remote Sensing Image Generative Models

The paper introduces a Dual Loop Data Cleaning (DLDC) method to automatically generate high-quality remote sensing image-text training data by leveraging contrastive multimodal quality evaluations. DLDC uses an external generation loop (EGL) based on a multimodal foundational model for layout description and an internal evaluation loop (IEL) based on contrastive learning metrics to assess image-text matching. Fine-tuning T2I models with the cleaned dataset results in significant improvements in image generation quality, as evidenced by substantial reductions in FID and increases in CLIP and RemoteCLIP scores, and improved downstream segmentation performance.

Introduces a dual-loop data cleaning method (DLDC) that automatically generates high-quality remote sensing image-text training data, eliminating the need for manual annotation.

Tian Xing, Hu Yan, Xinwei Wang +4

Data Curation & Synthetic Data·Multimodal Models·Computer Vision

Aug 8, 2025

Aug 8, 2025·affiliated lab: MIT CSAIL

gpt-oss-120b & gpt-oss-20b Model Card

This paper introduces gpt-oss-120b and gpt-oss-20b, two open-weight reasoning models built using a mixture-of-experts transformer architecture and trained via large-scale distillation and reinforcement learning. These models are optimized for agentic capabilities, including research browsing and tool use, and utilize a chat format for instruction following. The authors demonstrate strong performance on mathematics, coding, and safety benchmarks and release the model weights and related resources under an Apache 2.0 license.

Introduces and releases the weights for gpt-oss-120b and gpt-oss-20b, two open-weight reasoning models with strong agentic capabilities and performance across diverse benchmarks.

OpenAI Sandhini Agarwal, Lama Ahmad, Jason Ai +12140·2508.10925

Architecture Design (Transformers, SSMs, MoE)·Open-Source Models & Weights·Tool Use & Agents

May 31, 2025

ChartGen: Scaling Chart Understanding Via Code-Guided Synthetic Chart Generation

The paper introduces ChartGen, a fully automated pipeline for generating synthetic chart image-code pairs to improve chart understanding in vision-language models (VLMs). ChartGen leverages a VLM to reconstruct seed chart images into Python scripts and then uses a code-oriented LLM to iteratively augment these scripts, creating a diverse dataset. The authors generated 222.5K unique chart image-code pairs and used a held-out evaluation set to benchmark six open-weight VLMs, demonstrating significant room for improvement in chart-to-code reconstruction.

Introduces ChartGen, a novel code-guided synthetic chart generation pipeline that significantly expands the availability of chart image-code pairs for training and evaluating VLMs.

Jovana Kondic, Pengyuan Li, Dhiraj Joshi +12·2507.19492

Code Generation & Program Synthesis·Multimodal Models·Data Curation & Synthetic Data

Mar 5, 2025

Mar 5, 2025·affiliated lab: MIT CSAIL

Foundation models for generalizable electrocardiogram interpretation: comparison of supervised and self-supervised electrocardiogram foundation models

The authors developed and compared two open-source foundation models for ECG interpretation: DeepECG-SSL, a self-supervised model pretrained with contrastive learning and masked lead modeling, and DeepECG-SL, a supervised model. Both models were trained on over 1 million ECGs to predict 77 cardiac conditions and were evaluated on multiple datasets for ECG interpretation and digital biomarker tasks. DeepECG-SSL outperformed DeepECG-SL on digital biomarker tasks with limited labeled data, demonstrating the potential of self-supervised learning for ECG analysis, while both models showed minimal performance disparities across age and gender.

Demonstrates the efficacy of self-supervised learning for ECG analysis, particularly in low-data regimes, by developing and evaluating DeepECG-SSL, an open-source foundation model that outperforms its supervised counterpart on digital biomarker tasks.

A. Nolin-Lapalme, Achille Sowa, Jacques Delfrate +30

Open-Source Models & Weights·Training Efficiency & Optimization·Scientific Discovery & Drug Design

Feb 27, 2025

Modular Approaches to Complex Reasoning in Visual Question Answering Systems

This paper introduces a visual question answering (VQA) system leveraging VisualBERT, ViLT, cross-modal memory networks, memory-augmented attention, and vision-language pre-training models (Flamingo, BLIP) for improved multimodal fusion and dynamic memory retrieval. The system addresses complex reasoning by adapting to novel question types through few-shot learning. Experiments on VQA v2.0 demonstrate 80% accuracy, surpassing LSTM-CNN and attention-only baselines, alongside improved BLEU scores and precision-recall metrics.

Demonstrates a modular VQA architecture that integrates multiple deep learning techniques to achieve state-of-the-art performance on complex reasoning tasks.

Akshay Bhosale, Sujit Wandre, Sagar Chavan +2

Multimodal Models·Reasoning & Chain-of-Thought·Computer Vision

Feb 8, 2025

Assam University·Feb 8, 2025·affiliated lab: MIT CSAIL

Combining Convolutional Neural Networks with Reinforcement Learning for Autonomous Robotics

This paper introduces an Adaptable Reinforcement Learning-oriented Multifaceted Data Combination (AdRL-MDC) system to train a robotic hand for gaming, aiming to improve accuracy and consistency in motion management. The system integrates an adaptable training process for ensemble classification, a reinforcement learning paradigm for robot intelligence, and a multifaceted data combination framework. Experimental results demonstrate that the CNN-based ensemble framework achieves high accuracy with efficient computation, and the depth vision-oriented CNN classification algorithm attains 100% recognition accuracy.

Introduces an adaptable reinforcement learning framework (AdRL-MDC) that combines CNNs and RL to achieve high accuracy and robustness in robotic hand motion control for gaming.

Mrutyunjay Padhiary, Swati Powar, Anju Asokan +3

Robotics & Embodied AI·Computer Vision·RLHF & Preference Learning
