12 papers from Google Research on Architecture Design (Transformers, SSMs, MoE)
MResOpt achieves significantly fewer high-priority constraint violations in constrained optimization tasks while remaining computationally efficient.
PARCEL improves the efficiency and performance of visual tokenization by dynamically anchoring feature extraction to spatial pool tokens.
Why pick just one token mixer when you can have them all, dynamically switching between attention and linear recurrences for optimal efficiency and performance?
Splitting attention and feedforward networks onto separate GPUs can unlock 4x higher MoE LLM throughput, but only if you carefully tune the GPU partitioning strategy based on the workload.
You can slash the compute cost of visual geometry transformers by 85% without sacrificing accuracy by intelligently pruning redundant tokens across frames and within layers.
Graph transformers can be fundamentally limited by their tokenization strategy, as some tokenizations provably preclude efficient learning of structural representations realizable with other tokenizations.
ZKP proving, previously bottlenecked by MSM and NTT operations, can now achieve up to 10x higher throughput on TPUs thanks to a novel framework that reformulates ZKP kernels for AI-ASIC execution.
CGRA performance jumps by 2.7x thanks to NEURA, a compilation framework that elegantly transforms control flow into dataflow.
Refining generative models with discriminator guidance provably improves generalization, offering a theoretical justification for techniques like score-based diffusion.
Forget catastrophic forgetting: this function-preserving expansion method lets you fine-tune without sacrificing pre-trained knowledge, matching full fine-tuning performance at a fraction of the cost.
Recurrent models can now achieve Transformer-competitive performance on recall-intensive tasks, thanks to a simple memory caching mechanism that grows memory capacity with sequence length.
Randomly masking parameter updates in RMSProp delivers state-of-the-art LLM training performance, revealing a surprisingly effective form of geometric regularization.