Even state-of-the-art LLMs like GPT-5.2 falter on LakeQA, scoring just 18.37% on a benchmark that demands both search and multi-hop reasoning.