Erasing unwanted concepts from text-to-image models doesn't require retraining the whole U-Net: surgically misdirecting the text encoder's early self-attention layers achieves the erasure with minimal collateral damage.
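The key practical point is that only a narrow slice of parameters is touched. A minimal sketch of that parameter selection, assuming a CLIP-style text encoder whose parameters follow the common `text_model.encoder.layers.{i}.self_attn.*` naming pattern (the function name and layer-count threshold here are illustrative, not from the paper):

```python
def erasure_param_filter(param_names, n_early_layers=2):
    """Keep only self-attention parameters in the first n_early_layers
    transformer layers; everything else (later layers, MLPs, and the
    entire U-Net, which is absent from this list) stays frozen."""
    selected = []
    for name in param_names:
        parts = name.split(".")
        if "self_attn" in parts and "layers" in parts:
            layer_idx = int(parts[parts.index("layers") + 1])
            if layer_idx < n_early_layers:
                selected.append(name)
    return selected

# Example: of three parameters, only the layer-0 self-attention weight
# passes the filter; layer 5 is too deep and the MLP weight is excluded.
names = [
    "text_model.encoder.layers.0.self_attn.q_proj.weight",
    "text_model.encoder.layers.5.self_attn.q_proj.weight",
    "text_model.encoder.layers.0.mlp.fc1.weight",
]
print(erasure_param_filter(names))
```

Fine-tuning would then update only the selected parameters, steering the concept's text embedding elsewhere while the frozen U-Net and later encoder layers limit collateral damage.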
LVLMs can be trained more effectively by focusing only on the most visually informative samples and tokens, as measured by a new Visual Information Gain metric.
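The summary does not define the Visual Information Gain metric, but the selection mechanism it feeds is straightforward top-fraction filtering. A sketch under that assumption, with `score_fn` standing in for the (unspecified) metric:

```python
def select_top_fraction(samples, score_fn, fraction=0.3):
    """Rank training samples by a per-sample informativeness score and
    keep only the highest-scoring fraction. score_fn is a stand-in for
    the paper's Visual Information Gain metric, whose exact definition
    is not given here; any per-sample score works."""
    ranked = sorted(samples, key=score_fn, reverse=True)
    k = max(1, int(len(ranked) * fraction))
    return ranked[:k]

# Example with toy scores: keep the top half of four samples.
samples = [{"id": i, "vig": v} for i, v in enumerate([0.1, 0.9, 0.5, 0.3])]
top = select_top_fraction(samples, lambda s: s["vig"], fraction=0.5)
print([s["id"] for s in top])
```

The same ranking idea would apply at the token level, scoring individual visual tokens instead of whole samples.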