Search papers, labs, and topics across Lattice.
2
0
3
2
Multi-FPGA partitioning gets a 52% communication efficiency boost and 11x speedup by explicitly optimizing for network topology and logic replication, blowing away prior art.
Achieve practical FHE inference for Llama-3-8B with sub-100 second token generation by cleverly integrating KV caching, leaving prior art in the dust.