Fine-tuning LLMs doesn't have to trash their safety: adaptively regularizing updates based on predicted harmful intent keeps models aligned without sacrificing utility.
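
A minimal sketch of the core idea, under assumptions not stated in the summary: here the "adaptive regularization" is taken to be an L2 pull toward the original aligned weights, scaled by a per-batch harmfulness score. The names `adaptive_reg_loss`, `harm_score`, and `lam` are hypothetical, and the actual method may use a different penalty or classifier.

```python
import torch
import torch.nn as nn

def adaptive_reg_loss(task_loss, model, ref_params, harm_score, lam=1.0):
    # Assumed penalty form: L2 distance from the aligned reference weights,
    # scaled by predicted harmfulness. Benign batches (harm_score ~ 0)
    # fine-tune nearly unregularized; suspicious batches are pinned close
    # to the original, safety-aligned weights.
    reg = sum(((p - p0) ** 2).sum()
              for p, p0 in zip(model.parameters(), ref_params))
    return task_loss + lam * harm_score * reg

# Toy usage: one fine-tuning step on a small linear stand-in for an LLM.
model = nn.Linear(8, 2)
ref_params = [p.detach().clone() for p in model.parameters()]  # frozen aligned copy
opt = torch.optim.SGD(model.parameters(), lr=1e-2)

x, y = torch.randn(4, 8), torch.randint(0, 2, (4,))
harm_score = 0.9  # placeholder: in practice, output of a harmful-intent predictor
loss = adaptive_reg_loss(nn.functional.cross_entropy(model(x), y),
                         model, ref_params, harm_score)
loss.backward()
opt.step()
```

The design intuition is that a fixed regularization strength either blocks useful adaptation or lets harmful updates through; conditioning the strength on predicted intent lets the same training run do both.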