Diffusion models can efficiently sample lookahead action sequences for active search, outperforming traditional tree search while mitigating optimism bias.
LLMs can turn sparse rewards into dense training signals for RL agents, achieving comparable performance with significantly higher sample efficiency.