Reasoning models are surprisingly bad at controlling their own thoughts: Claude Sonnet 4.5 can control its chain-of-thought only 2.7% of the time, raising questions about the reliability of CoT monitoring.
GPT-5's real-time router learns to route queries to specialized models, making GPT-5 faster and more useful than its predecessors.
With gpt-oss-120b and gpt-oss-20b, open-weight reasoning models now rival proprietary systems in agentic capabilities and benchmark performance.