Achieve a 50% inference speedup on a large language model for European languages by compressing it from 11B to 7.35B parameters while retaining 90% of the original model's performance.