New York University, Redis
High PR-AUC scores can mislead model selection, as they often correlate with poor real-world performance in semantic caching scenarios.
LLM truthfulness is more nuanced than we thought: a new knowledge graph benchmark reveals that hallucination rates vary significantly depending on the breadth and depth of knowledge required.