LLMs respond to increasingly difficult out-of-distribution inputs by activating sparser representations in their last hidden states, revealing a quantifiable relationship between task difficulty and neural activity.
PathMoE reveals the specific modality interactions driving individual predictions in pediatric brain tumor classification, offering crucial interpretability for rare tumor subtypes.
DLMs aren't truly parallel because their training data is too sequential, but NAP shows how data curation can unlock genuine parallel decoding and boost reasoning performance.
Solve SMoE load balancing at inference time without retraining by replicating heavily used experts and quantizing underutilized ones, reducing load imbalance by up to 1.4x.