Huazhong University of Science and Technology, Wuhan, China
Safety in MoE LLMs isn't about routing harmful requests to "refusal experts"; it's surprisingly localized within specific experts, and you can break it without significantly changing the model's overall routing behavior.
Subject-specific variability in biomedical time-series can be mitigated by explicitly aligning spectral structure, leading to a 6% F1-score improvement over existing methods.
Merging seemingly safe LLMs can create dangerously misaligned models, thanks to a new "TrojanMerge" attack that exploits latent vulnerabilities.