Directly embedding quantile tokens into input sequences leads to sharper and more accurate distribution predictions, outperforming traditional methods by a substantial margin.
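The paper's token-embedding scheme isn't reproduced here, but quantile predictions of this kind are typically trained with the pinball (quantile) loss. A minimal sketch, assuming nothing beyond that standard objective:

```python
def pinball_loss(y_true: float, y_pred: float, tau: float) -> float:
    """Pinball (quantile) loss for quantile level tau in (0, 1).

    Minimizing it in expectation drives y_pred toward the tau-quantile
    of y_true's distribution (tau = 0.5 recovers the median).
    """
    err = y_true - y_pred
    return max(tau * err, (tau - 1.0) * err)

# Asymmetric penalty: under-predicting the 0.9 quantile costs more
# than over-predicting it by the same amount (1.8 vs 0.2 here).
under = pinball_loss(10.0, 8.0, 0.9)
over = pinball_loss(10.0, 12.0, 0.9)
```

One loss of this form per quantile token is the usual way a single model is trained to emit several quantiles of the same target at once.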
Freezing most of your critic network and only training a tiny LoRA adapter can dramatically improve off-policy RL performance and stability.
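A minimal sketch of the frozen-weights-plus-low-rank-adapter idea, on a toy linear value head rather than a real critic network; the dimensions, learning rate, and target here are illustrative, not the paper's recipe. The effective weights are W + BᵀA, and only the low-rank factors A and B receive gradient updates:

```python
import random

random.seed(0)
d, rank = 8, 2  # toy feature dim and LoRA rank (illustrative values)

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

W = [random.gauss(0, 1) for _ in range(d)]                          # frozen head
A = [[random.gauss(0, 1) for _ in range(d)] for _ in range(rank)]   # trainable
B = [0.0] * rank                                                    # trainable

def value(x):
    # Effective head is W + B @ A; zero-initialized B makes the
    # adapter an exact no-op before any training.
    delta = [sum(B[r] * A[r][j] for r in range(rank)) for j in range(d)]
    return dot(W, x) + dot(delta, x)

x = [random.gauss(0, 1) for _ in range(d)]
W_snapshot = list(W)
target = value(x) + 0.5          # ask the adapter for a small correction
initial_loss = (value(x) - target) ** 2

for _ in range(500):
    g = 2.0 * (value(x) - target)            # d(loss)/d(value)
    a = [dot(A[r], x) for r in range(rank)]  # A @ x, before this step's updates
    for r in range(rank):
        for j in range(d):
            A[r][j] -= 0.001 * g * B[r] * x[j]   # grad wrt A[r][j] is g*B[r]*x[j]
        B[r] -= 0.001 * g * a[r]                 # grad wrt B[r] is g*(A@x)[r]

final_loss = (value(x) - target) ** 2
assert W == W_snapshot  # the frozen critic head is never touched
```

The adapter adds only `rank * (d + 1)` trainable parameters, which is the usual argument for why this stabilizes off-policy updates: the critic can only drift within a tiny low-rank subspace.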
LLMs can detect rhetorical questions with surprisingly high accuracy, but that doesn't mean they understand rhetoric the way we do.