Atticus Geiger

Research focus

Interpretability & Mechanistic Interp (4) · Architecture Design (Transformers, SSMs, MoE) (1) · Open-Source Models & Weights (1) · Reasoning & Chain-of-Thought (1) · Natural Language Processing (1)

Frequent co-authors

Siddharth Boppana (2) · Jack Merullo (2) · Usha Bhalla (1)

Papers (4)

Stanford HAI · Apr 30, 2026 · also Northeastern, UCL

Do Sparse Autoencoders Capture Concept Manifolds?

Sparse autoencoders, despite their popularity for extracting interpretable features, often fail to capture the underlying manifold structure of concepts, instead fragmenting them across multiple, diluted features.

Usha Bhalla, Thomas Fel +20

Architecture Design (Transformers, SSMs, MoE) · Interpretability & Mechanistic Interp

Siddharth Boppana +10 · Mar 5, 2026

Reasoning Theater: Disentangling Model Beliefs from Chain-of-Thought

LLMs often know the answer long before their "reasoning" suggests, wasting tokens on performative chain-of-thought.

Siddharth Boppana, Annabel Ma +8

Interpretability & Mechanistic Interp · Open-Source Models & Weights · Reasoning & Chain-of-Thought

Aruna Sankaranarayanan +2 · Feb 17, 2026

Surgical Activation Steering via Generative Causal Mediation

Precisely steer LLM behaviors like refusal, sycophancy, and style transfer by surgically activating just a few key attention heads identified via Generative Causal Mediation.

Aruna Sankaranarayanan, Amir Zur, Atticus Geiger

Interpretability & Mechanistic Interp · Natural Language Processing

Jun 12, 2025 · also Google Research

Decomposing MLP Activations into Interpretable Features via Semi-Nonnegative Matrix Factorization

Forget sparse autoencoders: semi-nonnegative matrix factorization directly dissects MLP activations into human-interpretable features that causally steer LLMs better.

Or Shafran, Atticus Geiger, Mor Geva

Interpretability & Mechanistic Interp
