Search papers, labs, and topics across Lattice.
Baidu Inc.
2
0
4
6
A 7B parameter model, optimized with multi-task learning and RL, rivals the timeline summarization performance of a 671B parameter model, proving that task-specific fine-tuning can dramatically shrink model size without sacrificing quality.
Multimodal embeddings get a serious upgrade with CoCoA, a new pre-training method that forces models to compress all input information into a single token for reconstruction, leading to substantial quality gains.