Search papers, labs, and topics across Lattice.
4
6
8
13
The hardest AI tasks remain largely unsolved, with current models achieving only a 2.6% success rate on economically valuable workflows.
Success in long-horizon tasks hinges more on an agent's iterative persistence than on the quality of its initial solution.
The best LLM to answer a question isn't always the best LLM to *teach* the answer, and matching the "difficulty" of the explanation to the student's current abilities yields better learning.
Despite claims of safety alignment, state-of-the-art LLMs still spill the beans on hazardous scientific knowledge at an alarming rate, failing nearly 80% of the time on a new regulation-grounded benchmark.