Search papers, labs, and topics across Lattice.
2
0
3
5
G-STAR tackles long-form, multi-speaker ASR by giving Speech-LLMs time-aware speaker tracking, enabling robust identity linking across chunks.
Ditch slow, multi-step sampling for target speaker extraction: AlphaFlowTSE achieves faster, one-step generation with improved speaker similarity and real-world generalization.