SpeechLLMs can be pruned by up to 40% of their decoder layers without significant performance degradation, suggesting substantial architectural redundancy inherited from the pre-trained LLM.
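To make the finding concrete, here is a minimal sketch of structured layer pruning at the 40% ratio mentioned above. The function name, the choice to prune a contiguous block of later decoder layers, and the decision to always keep the final layer are illustrative assumptions, not the method from the work being summarized.

```python
def layers_to_keep(n_layers: int, prune_ratio: float = 0.4) -> list[int]:
    """Return indices of decoder layers retained after pruning.

    Removes a contiguous block of later layers (a common heuristic:
    deeper decoder layers are often the most redundant) while keeping
    the final layer, which feeds the output projection.
    NOTE: this selection heuristic is an assumption for illustration.
    """
    n_prune = int(n_layers * prune_ratio)
    # Prune the block of layers just before the last one.
    pruned = set(range(n_layers - 1 - n_prune, n_layers - 1))
    return [i for i in range(n_layers) if i not in pruned]

# Example: a 32-layer decoder pruned at 40% retains 20 layers.
kept = layers_to_keep(32, 0.4)
```

In practice the retained indices would be used to slice the decoder's module list before fine-tuning or evaluation; which layers are safest to drop is an empirical question the pruning study itself addresses.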