Achieve near-dense Video-LLM performance on long videos with up to 57% fewer FLOPs by adaptively selecting which video cubes and tokens to process.
GPT-4o now has open-source competition: Ming-Omni matches its modality support in a single, unified model capable of perception and generation across image, text, audio, and video.