LLM watermarks can now survive fine-tuning, quantization, and distillation thanks to a new method that embeds them in a stable functional subspace.
By enforcing graph isomorphism across counterfactual inputs, UGID shows that LLMs can be debiased by directly manipulating their internal representations and attention mechanisms.