A novel visual encoder is designed to obtain summary-oriented visual features that help generate higher-quality summaries. A minimum margin loss is also introduced to suppress the model's overconfidence when generating text during reasoning.
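The summary does not define the minimum margin loss, so the following is only an illustrative sketch of one way a margin-style penalty can discourage overconfident token predictions. It is not the authors' formulation. The function names `softmax` and `margin_penalty`, the `margin` parameter, and the choice to penalize the gap between the two most probable tokens are all assumptions made for illustration.

```python
import math


def softmax(logits):
    """Convert raw scores to probabilities (numerically stable)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def margin_penalty(logits, margin=0.5):
    """Hypothetical penalty: positive only when the gap between the two
    most probable tokens exceeds `margin`. Assumes at least two logits.
    This is an assumed formulation, not the one from the paper."""
    probs = sorted(softmax(logits), reverse=True)
    gap = probs[0] - probs[1]
    return max(0.0, gap - margin)


# A flat distribution incurs no penalty; a sharply peaked one does.
flat = margin_penalty([1.0, 1.0, 1.0])
peaked = margin_penalty([10.0, 0.0, 0.0])
```

Under this assumed form, such a term would be added to the usual generation loss during training, so that the model is discouraged from placing nearly all probability mass on a single token.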