Department of Computer Science, University of Copenhagen, Denmark
Output vector editing can suppress up to 87.9% of memorized sequences in large language models, significantly outperforming traditional neuron-level methods.
Context sensitivity in LLMs evolves significantly across training stages: SFT biases models towards simpler contexts, a bias that later stages can both reinforce and reshape.
LLMs not only answer questions but also shape user perception through their framing, with insider positioning and anthropomorphism linked in unexpected ways across cultures.
LLM-labeled data can match human-labeled data in aggregate performance for hostility detection, but the errors are systematically different, especially in nuanced cases.
Language models often disregard provided context, choosing instead to rely on potentially outdated or conflicting information learned during pre-training, revealing a critical flaw in their knowledge integration.