100 papers published across 6 labs.
Graph models can now generalize to entirely new datasets with different input features, thanks to a simple projection into a shared random space.
Achieve a near-lossless 60% reduction in attention latency for video editing by exploiting query sharpness to dynamically route attention.
Transformers may succeed at time series forecasting without relying on the complex superposition that drives their power in NLP, challenging the assumption that these models are leveraging rich compositional representations.
Outlier tokens in Diffusion Transformers aren't just extreme values; they corrupt local patch semantics, and can be tamed with Dual-Stage Registers to boost image generation quality.
Transformers can be explicitly designed to perform nonlinear regression in-context by leveraging attention as a featurizer, offering a theoretical understanding of how these models learn complex relationships from prompts.
Forget scaling laws – the real bottleneck in associative memory isn't storage, it's retrieval: forcing a single "winner" costs you a logarithmic factor in capacity compared to allowing a ranked list.
Skip the sampling: accurately predict the behavior of wide, random MLPs with a fraction of the compute, especially when assessing rare, high-stakes outcomes.
GMD algorithms, previously seen as a novel generative framework, can be understood as directly targeting fixed points of Wasserstein Gradient Flows, offering a new perspective on their optimization process.
Modeling 10,000+ correlated outputs is now tractable: T-LVMOGP offers a scalable alternative to restrictive low-rank MOGPs by learning a flexible deep kernel in a shared embedding space.
Forget rigid memory structures: Memini lets your LLM's external knowledge evolve organically, learning and forgetting like a brain.
Infinite-width approximations, a cornerstone of neural network theory, crumble much faster in recurrent models than previously thought, failing beyond a depth of order $\sqrt{n}$.
Doubly sparse regression gets a boost: this method avoids predictor duplication, saving compute, by projecting directly onto the intersection of selected groups.
Training MoE models just got a whole lot faster: Piper achieves up to 3.5x higher MFU by intelligently scheduling pipeline parallelism and optimizing communication.
Long-context models face a provable "impossibility triangle": you can't have efficiency, compactness, and unbounded recall *at the same time*.
Forget trying to shoehorn hypergraphs into pairwise representations – this diffusion model directly generates them from incidence matrices, unlocking more realistic and complex structures.
Scale multi-agent RL diversity metrics to hundreds of agents without sacrificing accuracy: Graph-SND offers a drop-in replacement for quadratic SND calculations, achieving near-identical results with order-of-magnitude speedups.
LLMs can now generate high-performance CUDA attention kernels that outperform hand-optimized code, thanks to a novel lift-transfer-lower approach that leverages expert knowledge.
Forget fine-tuning: "skill neologisms"—new soft tokens—let you inject skills into LLMs without weight updates, composing them zero-shot for flexible knowledge expansion.
Inverting time-domain marine electromagnetic data, a traditionally computationally intensive task, can now be done 21,000x faster with a deep learning model that also outperforms traditional optimization methods.
Geometric continuity in deep networks isn't just a byproduct of depth, but an actively sculpted property arising from the interplay of residual connections and symmetry-breaking activations.
Granular Mixture-of-Experts can now be efficient: AIR-MoE's two-stage routing slashes routing costs without sacrificing performance.
Batch normalization's power comes from reshaping the geometry of neural network decision boundaries on a per-batch basis, not just from optimization benefits.
Neural networks can now discover previously unknown behavior in hard PDE problems, revealing that Strichartz extremizers for the critical Airy equation are not attained but approached by mKdV breathers.
LLMs can be efficiently post-trained by only updating half the parameters, slashing memory costs without sacrificing performance.
LLMs can now generate neural architectures with 75% less code and higher accuracy by learning to write code "diffs" instead of building from scratch.
Token embedding geometry isn't just abstract math—it directly mirrors how language models internally represent and reason about the world, as shown by its alignment with board state and piece importance in chess.
Symmetric spectral analysis of attention is fundamentally blind to information flow direction, but a simple asymmetry coefficient can restore the signal.
Neural operators can stably and accurately correct the structured truncation errors of classical numerical solvers for dispersive PDEs, even with rough data.
Physics-informed neural operators can now learn continually without forgetting, thanks to a simple replay strategy that preserves past knowledge while rapidly adapting to new out-of-distribution data.
Diffusion models' reliance on global information isn't just a quirk – it's fundamentally linked to the moment they commit to a specific semantic outcome.
By explicitly modeling literal polarity in SAT formulas, GNNs can more accurately predict unsatisfiable cores.
Approximate computing can break MoEs in unexpected ways, with dense networks sometimes proving more robust, but careful retraining can unlock surprising efficiency gains in specific architectures.
Control-dependent latent dynamics, achieved with a surprisingly small parameter increase, unlock robust MPC performance in time-varying environments where standard Koopman methods falter.
Unlock white-box inference for SOC-ICNNs by directly reading out geometric primitives like Hessians from the optimal dual variables, bypassing black-box differentiation.
Forget opaque transformers: Gyan offers SOTA language modeling with full interpretability, lower compute, and human-like compositional understanding.
Ditch the attention: ConvRec proves convolutional networks can beat Transformers in sequential recommendation while slashing compute and memory costs.
MoEs, despite their scaling advantages, suffer from a surprising "spectral plasticity loss" in continual RL, but a simple Parseval penalty can recover performance.
By embedding whole-slide images in a hybrid hyperbolic-Euclidean space, BatMIL unlocks superior classification performance compared to traditional Euclidean-only methods, revealing the importance of geometric awareness in capturing complex tissue organization.
Hallucinations in diffusion models aren't just mode interpolation gone wrong, but instabilities on the model's manifold, and squashing its local intrinsic dimension can fix them.
Stop brittle, undeployable AI-generated code: this retrieval-augmented scaffolding method bakes in architectural constraints from the start.
A single vision-language foundation model, DART, can perform a full rope inspection workflow, including damage classification, severity estimation, and few-shot recognition, all without task-specific fine-tuning.
Shuffling activations, a popular defense in secure Transformer inference, crumbles under a new alignment attack that recovers model weights for just $1.
Ditch the vector DB – this new agent architecture achieves SOTA memory recall by storing everything verbatim and optimizing retrieval, all in a single SQLite file.
Transformers with average attention can natively execute arithmetic circuits, suggesting a new architectural direction for reasoning and computation.
Unlock scalable, high-quality singing voice synthesis by directly generating structured musical scores from audio, outperforming existing systems on multiple datasets.
HeterSEED achieves state-of-the-art performance on heterophilic heterogeneous graphs by decoupling semantic and structural information, offering a more robust approach than relying on feature similarity alone.
Freezing your VAE and permuting high-frequency visual signals unlocks a new SOTA for VLM prompt learning, boosting harmonic-mean accuracy to 81.51%.
CNN-BiLSTM beats AutoML for Indonesian hate speech detection, but the gains are modest, suggesting the dataset's limitations are a bigger bottleneck than model architecture.
Even state-of-the-art multilingual models struggle to tag parts of speech in Tajik when trained on isolated words, highlighting the critical role of syntactic context.
UniVer achieves state-of-the-art speculative decoding by jointly optimizing multi-step and multi-draft verification, outperforming existing methods by up to 8.5% in acceptance length.
LLMs get schooled in dialogue state tracking by a mixture-of-experts architecture that uses a graph neural network and ReAct agents to achieve state-of-the-art results with a T5-Small backbone.
Lattice-based cryptography's reliance on injected noise for security is more akin to hiding secrets under a rug than truly erasing them, leaving them vulnerable to future quantum attacks.
Remotely hosted Mixture-of-Experts LLMs are vulnerable to input-only attacks that hijack their routing mechanisms, forcing them to generate harmful content.
A clever routing strategy lets a tiny 3B code model outperform a massive 480B model on routine code completion tasks, slashing accelerator usage by 58%.
Proving semantic equivalence between LLVM IR and RISC-V code is now possible within a single framework, thanks to a new formal RISC-V semantics built on Interaction Trees.
Forget dataset-specific hacks: CPCANet achieves SOTA domain generalization by explicitly learning a structured, domain-invariant subspace with a differentiable CPCA layer.
Discrete diffusion, with carefully designed transition matrices for commands and parameters, unlocks superior CAD generation compared to continuous diffusion baselines.
Generate CT-like images from ultrasound with a transformer-augmented network, potentially reducing the need for harmful radiation exposure.
Overlooked diagonal epipolar geometry holds the key to boosting light field super-resolution, as demonstrated by a new omnidirectional EPI Transformer.
RL fine-tuning unlocks a 6x performance gain for in-place trajectory editing in autonomous driving, demonstrating the power of aligning diffusion planners with reinforcement learning.
Offloading communication to SmartNIC DPUs can speed up host-dominated workloads by 1.55x, but the lack of Direct Cache Access creates a massive DRAM bottleneck.
Implicit time integration on GPUs gets a 3x speed boost thanks to a novel algebraic coarsening method that avoids costly explicit remeshing.
Run billions of bitwise operations directly in your 3D NAND flash, error-free, using just standard instructions.
Exponent bits are the Achilles' heel of floating-point arithmetic, as corrupting them in RISC-V vector processors leads to the most severe silent data corruption.
Radically reduce power consumption in AI chips with a circuit-switched network-on-chip that carves out dedicated "lanes" for predictable communication flows.
RangeGuard lets you tolerate 64+ flipped bits in DNN memory using just 16 bits of parity, without sacrificing accuracy.
Automating diabatization with neural networks unlocks accurate simulation of complex non-adiabatic molecular dynamics, revealing unexpected fragmentation pathways.
Bio-inspired signal processing lets you hear subtle underwater sounds better than ever, achieving 98.41% accuracy in classifying targets even in noisy conditions.
Unlock near-oracle speech enhancement performance from compact microphone arrays by virtually expanding their spatial coverage with a novel neural network.
Generative recommendation gets a boost: CapsID's soft-routed semantic IDs outperform hard-quantized baselines and even rival sparse-dense hybrids, all while slashing inference latency by nearly half.
Forget complex assembly: this 3D printing technique lets you pop out functional, self-folding robots with integrated sensors and actuators directly from a flat sheet.
By grounding temporal Gaussian aggregation in spatial voxels, Ground4D achieves state-of-the-art 4D reconstruction in challenging off-road environments where existing methods falter.
Brain tumor segmentation gets a lightweight boost: DALight-3D achieves comparable accuracy to larger U-Nets with significantly fewer parameters.
Unlock efficient 4D object understanding from dynamic point clouds with Velox, a representation that's descriptive, compressive, and accessible.
Explicitly modeling human-object interactions boosts multi-person human mesh recovery accuracy by up to 9.9%, showing that interaction context is key to understanding human pose and shape in complex scenes.
Mamba's linear complexity meets perceptual image compression, yielding a lightweight model that rivals GANs and diffusion models in visual quality while being far more efficient.
Generating synthetic training data with multi-modal diffusion beats hand-crafting new detection architectures for PCB defect inspection.
Random masking in self-supervised learning can destroy crucial diagnostic features in medical images; instead, try inverting chaos.
Spatial transcriptomics predictions get a boost from HEXST, a Transformer that respects the hexagonal geometry of spot arrays and recovers gene-specific spatial heterogeneity.
3D Gaussian Splatting gets a nearly 2x speed boost thanks to a clever bounding box strategy that drastically reduces unnecessary tile intersection checks.
Forget ImageNet – pre-training with chaotic augmentations yields surprisingly robust texture features, outperforming SOTA methods across diverse texture datasets.
Bidirectional interaction among enhanced understanding, controllable spatial editing, and novel-view-assisted reasoning enables a unified multimodal model to achieve spatial intelligence beyond general visual competence.
Get 4x faster LLM inference with Budgeted LoRA, which smartly redistributes compute between dense and low-rank pathways during distillation, outperforming standard LoRA in both speed and function-style in-context learning.
Forget boring rotary embeddings: Jordan-RoPE unlocks distance-modulated phase interactions in attention, letting your model learn relationships like "the further apart, the stronger the cosine similarity."
Domain match and language relatedness trump joint vocabularies for effective knowledge transfer in multilingual NMT.
SATFormer shows that selectively gating access to early-layer representations boosts Transformer performance, especially in retrieval tasks, without sacrificing efficiency.
Transformers generalize out-of-distribution not by clever interpolation, but by learning a separate, orthogonal representation subspace for unseen tasks.
CBAM could reshape Europe's electricity market, giving low-carbon countries a competitive edge while burdening high-carbon economies.
Forget trusted online policy enforcement points: this revocation-ready key management layer uses ciphertext key publication to enforce dynamic, multi-user authorization for releasing or using bulk-data decryption keys in blockchain-based IoT data sharing systems.
Get strong pointer integrity and confidentiality without metadata overhead: LIPPEN encrypts pointers in-place, turning every pointer into a cryptographically protected block.
Provably undetectable backdoors can be injected into pre-trained image classifiers, even with white-box access, by exploiting sparse perturbations and Gaussian dithering.
Stochastic sampling from p-bit Ising models can slash the search effort of CDCL SAT solvers by over 80% on certain problem instances.
Computation-in-memory combined with lightweight cryptography slashes energy consumption by up to 44% in steganography applications.
Forward-Forward learning can finally compete with backpropagation on complex image tasks, thanks to a novel covariance-aware goodness function that captures crucial second-order feature dependencies.
Achieve a near order-of-magnitude reduction in tail timing error in mixed-criticality robotics by decoupling safety-critical control from user applications.
ClusterLess slashes workflow completion times by up to 40% and nearly doubles deadline satisfaction in federated edge environments, outperforming existing methods.
AI training jobs can now shrug off network failures that used to halt progress, thanks to a new resilient networking stack deployed at OpenAI and Microsoft.
Serverless orchestration falls apart when you move it to space, but this paper proposes a new architecture to fix it.
Get up to a 40% performance boost and 15% energy savings on scientific computing kernels by offloading OpenMP loops to AMD's AI Engines with minimal code changes.
Forget simplistic roofline models: these analytical models nail GPU performance prediction on Blackwell and CDNA3 with under 1.5% error.