Forget complex disentanglement architectures or low-quality synthetic targets: MimicLM achieves superior voice imitation by cleverly using synthetic speech as the *source* and real speech as the *target* in a pseudo-parallel training setup.
Speech LLMs struggle not merely because of a shift in input distribution, but because they must condense redundant acoustic information into stable, high-level semantic representations.