Search papers, labs, and topics across Lattice.
2
0
5
Forget manual curation—aligning policy gradients with a validation set adaptively selects RL training data, leading to more stable LLM training and improved performance.
Ditching pre-recorded enrollment speech, this work shows how wake words can bootstrap target speech extraction, paving the way for more natural human-machine dialogues.