Lattice AI Research

Research focus

Eval Frameworks & Benchmarks (2) · Reasoning & Chain-of-Thought (2) · Tool Use & Agents (2) · Red-Teaming & Adversarial Robustness (1)

Frequent co-authors

Ian Kivlichan (2) · Micah Carroll (2) · Aidan McLaughlin (2) · Alec Helyar (2)

Papers (3)

Anthropic · Mar 5, 2026 · also Google Research

Reasoning Models Struggle to Control their Chains of Thought

Reasoning models are surprisingly bad at controlling their own thoughts: Claude Sonnet 4.5 can control its chain-of-thought only 2.7% of the time, raising questions about the reliability of CoT monitoring.

Chen Yueh-Han, Robert McCarthy, Bruce W. Lee +4

Eval Frameworks & Benchmarks · Reasoning & Chain-of-Thought · Red-Teaming & Adversarial Robustness

OpenAI · Dec 19, 2025 · also Anthropic, BAIR, Mila, MIT CSAIL +6

OpenAI GPT-5 System Card

GPT-5's real-time router learns to route queries to specialized models, making it faster and more useful than its predecessors.

Aaditya K. Singh, Adam Fry, Adam Perelman +47962

Eval Frameworks & Benchmarks · Reasoning & Chain-of-Thought · Tool Use & Agents

OpenAI · Aug 8, 2025 · also Anthropic, MIT CSAIL, BUPT, Imperial +1

gpt-oss-120b & gpt-oss-20b Model Card

Open-weight reasoning models now rival proprietary systems in agentic capabilities and benchmark performance, thanks to gpt-oss-120b and gpt-oss-20b.

OpenAI Sandhini Agarwal, Lama Ahmad, Jason Ai +121403

Architecture Design (Transformers, SSMs, MoE) · Open-Source Models & Weights · Tool Use & Agents


Bowen Baker
