Search papers, labs, and topics across Lattice.
Current research agents still struggle with retrieval robustness and hallucination control, even when evaluated in a static, verifiable research environment.
Debugging complex code agents just got easier: CodeTracer reconstructs full state transition histories, pinpointing failure origins and enabling recovery of failed runs.
MLLMs can be made significantly safer without sacrificing performance by disentangling risks in multimodal inputs and using RLAIF, outperforming even GPT-4V by 16% on safety benchmarks.