Stop reinventing the wheel: OpenWorldLib offers a unified framework and codebase for advanced world models, finally bringing standardization to a fragmented field.
Achieve robust, real-time 3D multi-object tracking in panoramic views by representing object states on a sphere, sidestepping the limitations of image-plane trackers and redundant Euclidean formulations.
World models can now remember and realistically regenerate dynamic objects that temporarily disappear from view, thanks to a novel hybrid memory architecture.
Generate multi-shot videos at 16 FPS on a single GPU and interactively steer the narrative in real time, thanks to a novel causal architecture that overcomes the limitations of bidirectional models.
Achieve lifelike character animation with 10x faster inference using Kling-MotionControl, a DiT-based framework that intelligently handles body, face, and hand motions.
Forget single-number video quality scores: UltraVQA and Analytic Score Optimization (ASO) unlock richer, multi-faceted evaluations that better align with human preferences.
Forget generic CoT: Embed-RL uses reinforcement learning to generate reasoning traces that are explicitly optimized for multimodal embedding tasks, leading to significant performance gains.
Forget separate image and video models: VINO's single diffusion backbone handles both, opening the door to truly unified visual creation and editing.