Search papers, labs, and topics across Lattice.
2
0
5
2
LVLMs are better at spotting their own mistakes than generating correct answers in the first place, and this self-awareness can be exploited to reduce hallucinations.
Self-distillation in LLMs can leak information and destabilize training, but combining it with verifiable rewards yields a sweet spot for improved convergence and stability.