Yonsei University
Data contamination leaves a tell-tale geometric fingerprint across LLM layers, detectable even when standard output-based methods fail after RL post-training.
Agents excel at using tools but falter significantly at navigation, where errors dominate their failures on complex tasks.
LM Arena's model anonymity is more vulnerable than previously thought: a new attack, INTERPOL, leverages interpolated preference learning to expose deep stylistic patterns and manipulate rankings.