Ningbo Institute of Digital Twin, Eastern Institute of Technology
Static depth pruning emerges as the most effective strategy for LLM acceleration, achieving near-theoretical speedup limits in memory-bound contexts.