4 papers published across 2 labs.
Forget expensive downstream evaluations: token-level statistics from expert-written solutions can reliably forecast LLM performance with 10,000x less compute.
Stop wasting compute on irrelevant actions: targeted hindsight self-distillation focuses LLM agent training on the critical failure points, boosting performance and slashing training time.
Full-attention LLMs are intrinsically sparse and can be transformed into highly efficient sparse models with minimal training, sidestepping the need for expensive sparse pre-training.
Stop IP thieves cold: LoREnc lets you lock down your foundation models and LoRA adapters without retraining, crushing model recovery attacks while keeping performance intact for authorized users.