Hanyang University
LLM-augmented training with similarity-aware masking lets weakly-supervised video captioning models generate more accurate event descriptions and temporal boundaries, even with sparse training data.