12 papers from Meta AI (FAIR) on Reasoning & Chain-of-Thought
Achieve significant reasoning gains in frozen LLMs (+22.4%) without retraining by adaptively routing reward model guidance at the token level during inference.
On-policy reward modeling with LLM judges not only unlocks significant performance gains on complex mathematical reasoning tasks, but also generalizes to improve performance on simpler numerical and multiple-choice benchmarks.
LLMs can now infer plausible stage layouts from unstructured text alone, opening up new possibilities for automated media production.
LLMs struggle to generate diverse and specific connections between concepts, even with high token budgets and "thinking" prompts, revealing a gap in creative associative reasoning.
LLM reasoning research is inadvertently paving a dangerous path towards AI situational awareness and strategic deception, demanding a re-evaluation of current safety measures.
Even the best open-weight LLMs still fail on nearly two-thirds of questions requiring reasoning over scientific tables, highlighting a persistent "execution bottleneck" in translating strategy to action.
LLMs can ace math problems while reasoning erratically: 82% of their correct answers arise from unstable, inconsistent logic.
Instruction-following in large reasoning models gets a serious upgrade with RAIN-Merging, a gradient-free technique that merges in instruction-tuned capabilities without wrecking the model's ability to think step-by-step.
By surgically intervening in MLLM decoding, this work cuts hallucination rates without sacrificing descriptive quality, a feat prior methods struggled to achieve.
Escape the bottleneck of translating product intent into ranking-system hypotheses: GEARS is an agentic framework that autonomously discovers and validates superior ranking policies.
Unlock superhuman visual reasoning in multimodal models simply by giving them the ability to think step-by-step at test time.
Forget scaling laws: this work shows you can get SOTA reasoning from sub-billion parameter models with *less* data, if you're smart about curation and resampling.