LLM agent progress increasingly hinges on better external cognitive infrastructure, not just stronger models.
Personal photo retrieval isn't just about visual similarity; PhotoBench reveals that current models fail to leverage the rich context of our lives (time, place, people) needed to truly understand our search intent.
LMM-based GUI agents stick out like a sore thumb in human-centric mobile environments, but simple techniques can make them blend in without sacrificing utility.
Current LLM evaluation benchmarks often conflate chatbots and true AI agents, leading to misaligned research efforts, but this survey provides a framework for targeted evaluation based on environmental complexity and agent capabilities.