MoEs don't always need learned routers: routing information can be embedded directly in the hidden state.
Ditch the text: WavSLM shows you can train a competitive speech language model using only distilled WavLM representations, unlocking a simpler, single-stream generative pretraining paradigm for speech.