Shanghai Jiao Tong University, *Equal Core Contributions, #Project Lead
Forget dumb context stuffing: LongSeeker shows that agents that strategically *edit* their own memory solve web search tasks far more reliably.
A new dataset, SeIQA, offers a benchmark to evaluate how humans perceive semantic loss in degraded images, pushing beyond traditional quality metrics.
LLMs can verify code more effectively by focusing on test case utility rather than sheer quantity, achieving a 28.5% higher mutation score with 19.3% fewer tests.