SoulX-Duplug is a plug-and-play module for human-like full-duplex voice interaction: by acting as a semantic voice activity detector (VAD), it reduces latency and improves turn management.
V2A-DPO aligns video-to-audio generation with human preferences, surpassing previous methods by directly optimizing for semantic consistency, temporal alignment, and perceptual quality.