
AI safety company building reliable, interpretable, and steerable AI systems. Creator of Claude.
Reasoning models are surprisingly bad at controlling their own thoughts: Claude Sonnet 4.5 can control its chain-of-thought only 2.7% of the time, raising questions about the reliability of CoT monitoring.
Current AI benchmarks miss the crucial effects of AI R&D automation; this work proposes the metrics we should be tracking instead.
Model handoffs in multi-turn LLM systems can swing performance by up to 13 percentage points, revealing a hidden reliability risk that single-model benchmarks miss.
LLMs harbor surprisingly consistent hidden beliefs on sensitive topics like mass surveillance and torture, even when direct questioning suggests otherwise.
Most AI models, even those from top labs, fail to disclose critical safety information such as deception behaviors and hallucination risks.
Claude 2 can match the performance of top medical specialists on pulmonary thromboembolism knowledge assessments, suggesting AI's potential for clinical decision support.
Despite their promise, even the best multimodal LLM (GPT-4o) achieves only 26% accuracy in grading knee osteoarthritis from radiographs, revealing a significant gap in clinical reliability.
AI-generated feedback from GPT-4o and Claude-Sonnet-4 on student portfolios shows promise for high-stakes clinical assessments, but careful evaluation is needed to ensure accuracy and educational value.
LLMs can generate plain-language summaries of scientific research that match the quality of human-written ones while being easier to read.