Search papers, labs, and topics across Lattice.
IBM Research
1
0
3
2
HybridRAG-Bench reveals that existing benchmarks overestimate the reasoning abilities of retrieval-augmented LLMs due to contamination, offering a more realistic evaluation using up-to-date scientific knowledge.