Shanghai AI Laboratory, Nanjing University
InternVL-U, a 4B-parameter model, punches above its weight: it outperforms 14B-parameter models in multimodal generation and editing, thanks to a novel data synthesis pipeline and architecture.
Context inconsistency in stepwise group-based RL can severely bias advantage estimation, but a hierarchical grouping strategy can fix it without extra compute.