100 papers published across 4 labs.
Requirements volatility doesn't just delay projects; it directly undermines software architecture, leading to technical debt and scheduling nightmares.
Unlock geometric algebra's performance potential in neural networks and spatial computing by compiling directly from multi-way relationships, eliminating manual specialization and ensuring geometric correctness.
Multi-party function secret sharing just got a whole lot more practical: a new DDH-based scheme slashes key sizes by up to 10x.
AdaMuS overcomes the bias towards high-dimensional data in multi-view learning by adaptively pruning redundant parameters and sparsely fusing views, leading to improved performance on dimensionally unbalanced data.
LLMs aren't just better tools; they're forcing us to rethink the very nature of information, knowledge, and meaning in system design.
The field of video understanding is rapidly shifting from isolated pipelines to unified models capable of adapting to diverse downstream tasks, demanding a re-evaluation of current approaches.
Achieve controllable and scalable speech generation with MOSS-TTS, enabling zero-shot voice cloning and long-form synthesis.
Forget finetuning – Kumiho's graph-native memory lets you swap in a better LLM and instantly double your agent's reasoning accuracy on complex cognitive tasks.
Video diffusion transformers exhibit a hidden "magnitude hierarchy" in their activations that can be exploited for training-free quality improvements via a simple steering method.
Forget geometric LODs: tokenizing 3D shapes by semantic salience unlocks SOTA reconstruction and efficient autoregressive generation with 10x-1000x fewer tokens.
Forget scaling laws: dropout robustness in transformers is a lottery, with smaller models sometimes showing perfect stability while larger models crumble under stochastic inference.
Generate consistent stereo videos directly from RGB data, bypassing depth estimation and monocular-to-stereo conversion, with StereoWorld's novel camera-aware attention mechanisms.
Unlock faster, more accurate interlinear glossing for low-resource languages by treating morphemes as atomic units, outperforming existing methods and enabling user-guided lexicon expansion without retraining.
Generate realistic, atom-level molecular dynamics trajectories orders of magnitude faster with a novel State Space Model that captures long-range dependencies in biomolecular systems.
Ditch costly PIDE integration: RHYME-XT learns the flow map directly, offering a continuous-time, discretization-invariant representation that beats state-of-the-art neural operators.
LLMs can get a massive multilingual boost, especially in low-resource languages, by offloading translation to specialized models and carefully aligning their representations.
Attention sinks aren't just a forward-pass phenomenon; they actively warp the training landscape by creating "gradient sinks" that drive massive activations.
Achieve single-pass alignment of multi-talker speech – a feat previously impossible – by modeling overlaps as shuffles.
Achieve near-optimal waveform optimization with 98.8% spectral efficiency using a 5-layer, AutoML-tuned unrolled proximal gradient descent network trained on just 100 samples.
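For readers unfamiliar with unrolled optimization, here is a minimal sketch of the general idea of an unrolled proximal gradient descent network (ISTA-style, for an L1-regularized least-squares problem) — not the paper's waveform-specific architecture, and the per-layer step sizes and thresholds shown here are fixed placeholders standing in for parameters that would be learned:

```python
import numpy as np

def soft_threshold(x, lam):
    """Proximal operator of the L1 norm (shrinkage)."""
    return np.sign(x) * np.maximum(np.abs(x) - lam, 0.0)

def unrolled_pgd(A, b, n_layers=5, steps=None, thresholds=None):
    """Run a fixed number of proximal gradient steps, one per 'layer'.
    In a trained unrolled network, `steps` and `thresholds` would be
    learned per layer; here they default to safe hand-set values."""
    n = A.shape[1]
    L = np.linalg.norm(A, 2) ** 2            # Lipschitz constant of the gradient
    steps = steps if steps is not None else [1.0 / L] * n_layers
    thresholds = thresholds if thresholds is not None else [0.1 / L] * n_layers
    x = np.zeros(n)
    for eta, lam in zip(steps, thresholds):
        grad = A.T @ (A @ x - b)             # gradient of 0.5 * ||Ax - b||^2
        x = soft_threshold(x - eta * grad, lam)  # gradient step + proximal step
    return x
```

Unrolling turns each iteration into a layer with its own trainable parameters, which is why such networks can reach near-optimal solutions in very few layers and from very few training samples.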
Software architecture, a critical but underspecified domain, finally gets a unified benchmarking platform with ArchBench, enabling standardized evaluation of LLMs on complex system design tasks.
Injecting "beneficial noise" into cross-attention mechanisms can significantly improve unsupervised domain adaptation by forcing models to focus on content rather than style distractions.
Ruyi2.5 achieves comparable performance to Qwen3-VL on general multimodal benchmarks while significantly outperforming it in privacy-constrained surveillance, demonstrating the effectiveness of its edge-cloud architecture.
Synthesizing realistic 6-DOF object manipulation trajectories in complex 3D environments just got a whole lot better with GMT, a multimodal transformer that substantially outperforms existing methods.
By disentangling semantic and contextual cues in vision-language models, PCA-Seg achieves state-of-the-art open-vocabulary segmentation with only 0.35M additional parameters per block.
Achieve up to 2.4x speedup over OpenBLAS on RISC-V by using MLIR and xDSL to generate optimized RVV code, finally unlocking the potential of RISC-V vector extensions.
Training video diffusion models with pixel-wise losses just got a whole lot cheaper: ChopGrad reduces memory complexity from linear to constant in video length.
Graph transformers avoid oversmoothing in deep layers by structurally preserving community information, a theoretical advantage over GCNs revealed through Gaussian process limits.
Cycle consistency training unlocks stable and accurate inverse kinematics for wearable soft robots, even with their inherent nonlinearities and hysteresis.
Convolutional Neural Operators (CNOs) surprisingly excel at capturing translated dynamics in the FitzHugh-Nagumo model, despite other architectures achieving lower training error or faster inference.
Forget prompt engineering: this new region proposal network spots objects across diverse datasets without *any* text or image prompts.
Infinite neural nets can be sparse, and this paper proves it, showing that total variation regularization provably yields sparse solutions in infinite-width shallow ReLU networks, with sparsity bounds tied to the geometry of the data.
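As a sketch of the standard formulation behind results of this kind (not necessarily this paper's exact objective): an infinite-width shallow ReLU network is written as an integral against a signed measure \(\mu\) over neuron parameters, and training penalizes the total variation norm of \(\mu\):

```latex
\min_{\mu}\; \sum_{i=1}^{n} \ell\big(f_\mu(x_i),\, y_i\big) \;+\; \lambda\, \|\mu\|_{\mathrm{TV}},
\qquad
f_\mu(x) \;=\; \int \max\big(0,\, \langle w, x\rangle + b\big)\, d\mu(w, b).
```

"Sparse" here means the minimizing measure is supported on finitely many atoms, i.e., it is equivalent to a finite-width network; the teased result bounds the number of atoms in terms of the geometry of the data.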
Ditch the feature engineering: Baguan-TS lets you use raw time series sequences directly for in-context forecasting, outperforming traditional methods.
Ditch quadratic attention bottlenecks: this new transformer variant achieves competitive time-series forecasting with O(N log N) complexity by representing sequence states on a unit circle.
Lossless compression can actually *speed up* LLM inference on GPUs, not just shrink model size, thanks to ZipServ's hardware-aware design.
Enterprise AI can achieve 50% token reduction and zero cross-entity leakage by implementing a shared, governed memory architecture for multi-agent workflows.
Forget training behemoths: ADMs slash memory overhead to just twice the inference footprint while guaranteeing geometric correctness and continuous adaptation.
Achieve significant latency and energy savings in memory systems with an RL-based controller that also provides insights into *why* its decisions are optimal.
Ditch backprop's limitations: this synthesizable RTL implementation brings predictive coding networks to life in fully distributed hardware.
KANs get a 50x BitOps reduction without accuracy loss by quantizing their B-splines down to 2-3 bits and using lookup tables.
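To make the low-bit idea concrete, here is a minimal, illustrative sketch of uniform coefficient quantization with a dequantization lookup table — a generic scheme, not the paper's actual quantizer, and applied here to an arbitrary coefficient vector rather than real B-spline parameters:

```python
import numpy as np

def quantize_coeffs(coeffs, bits=3):
    """Uniformly quantize coefficients to 2**bits levels.
    Returns (integer codes, lookup table of reconstructed values)."""
    c = np.asarray(coeffs, dtype=float)
    levels = 2 ** bits
    lo, hi = c.min(), c.max()
    scale = (hi - lo) / (levels - 1) if hi > lo else 1.0
    codes = np.round((c - lo) / scale).astype(np.int64)   # 0 .. levels-1
    lut = lo + scale * np.arange(levels)                  # dequantization table
    return codes, lut

def dequantize(codes, lut):
    """Recover approximate coefficients by table lookup."""
    return lut[codes]
```

At 2-3 bits the lookup table has only 4-8 entries, so multiplications against spline coefficients can be replaced by cheap table indexing — the source of the BitOps reduction the teaser describes.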
Acoustic and phonetic NACs encode accent in fundamentally different ways, with implications for how we interpret and manipulate these representations.
By explicitly modeling and predicting non-stationary factors in both time and frequency domains, TimeAPN significantly boosts the accuracy of long-term time series forecasting, outperforming existing normalization techniques.
LLMs can be drastically compressed without retraining because the relative ordering of weights matters far more than their exact values, opening the door to efficient, training-free compression techniques.
Forget SVD: CARE aligns low-rank attention approximations with input activations, boosting accuracy up to 1.7x and slashing perplexity by 215x when converting models to multi-head latent attention.
No training needed: ARAM dynamically adjusts retrieved context guidance in masked diffusion models based on signal quality, resolving retrieval-prior conflicts on the fly.
By explicitly modeling pollutant propagation delays with neural delay differential equations, AirDDE significantly improves air quality forecasting accuracy.
AI's current limitations in adaptability stem from its reliance on psychological learning theories, suggesting a need for representational architectures where systematic behavior is inherent, not accidental.
Generative models can fail to produce globally consistent counterfactuals when causal graphs have complex topologies, but a novel sheaf-theoretic framework with entropic regularization can overcome these limitations.
Achieve 4K image-to-video generation with diffusion models without training by cleverly fusing tiled denoising with a low-resolution latent prior, balancing detail and global coherence.
A simple adaptive normalization technique can significantly improve continual learning performance on tabular data by mitigating catastrophic forgetting in dynamic environments.
Synthesizing realistic intermediate video frames just got a whole lot better, thanks to a novel attention mechanism that anchors to keyframes and text prompts for improved consistency and semantic alignment.
Achieve SE(3) equivariance and memory scalability in point cloud analysis with coordinate-based kernels, outperforming state-of-the-art equivariant methods on diverse tasks.
Normalizing error signals, not just activations, is the key to unlocking the benefits of inhibition-mediated normalization for learning in neural networks.
Transformer LMs learn linguistic abstractions before memorizing specific lexical items, mirroring key aspects of human language acquisition.
Mamba, the darling of sequence modeling, now powers a GAN that beats StyleGAN2-ADA in image synthesis, thanks to a clever latent space routing trick.
Secure enclave updates and migrations, previously missing from RISC-V TEEs, are now practical thanks to a novel toolkit that adds minimal overhead.
LLMs struggle with code comprehension, but a simple RNN pass over their embeddings can boost accuracy by over 5%.
By mapping permutations to a continuous space of "soft ranks," this new diffusion approach makes learning permutation distributions far more tractable, especially for long sequences.
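The "soft rank" idea in general (not this paper's diffusion construction) can be sketched with pairwise sigmoids: each item's soft rank is one plus the soft count of items scoring below it, and a temperature controls how closely the relaxation approaches the hard ranking:

```python
import numpy as np

def soft_rank(scores, tau=0.01):
    """Differentiable relaxation of (1-based) ranks. As tau -> 0 this
    approaches the hard ranking; larger tau gives a smoother surface
    that is easier to optimize through."""
    s = np.asarray(scores, dtype=float)
    diff = s[:, None] - s[None, :]             # pairwise score gaps
    wins = 1.0 / (1.0 + np.exp(-diff / tau))   # sigmoid((s_i - s_j) / tau)
    np.fill_diagonal(wins, 0.0)                # an item doesn't outrank itself
    return 1.0 + wins.sum(axis=1)              # soft rank of each item
```

Because the relaxation is continuous in the scores, gradient-based models (including diffusion models over the score space) can be trained where discrete permutations would block backpropagation.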
By reorganizing 3D scenes into structurally-aware subscenes, S-VGGT offers a parallel geometric bridge for efficient processing, slashing global attention costs without compromising reconstruction fidelity.
Ditch the polar decomposition: MUD offers a surprisingly simple and efficient alternative for momentum whitening, speeding up transformer training by up to 50% compared to AdamW and Muon.
AI spots a hidden pattern in lung scans of lupus patients, revealing that specific airway dilations in the upper lobes could be a telltale sign of interstitial lung disease.
Achieve competitive video generation with Stable Diffusion using only 2.9% additional parameters by adapting temporal attention based on motion content, outperforming methods with explicit temporal consistency losses.
LLMs can maintain performance while skipping global attention for 80% of tokens, slashing compute costs and memory footprint in long-context scenarios.
Predicting permeability tensors from microstructure images just got 33% more accurate thanks to a physics-informed CNN-Transformer that learns faster and generalizes better via pretraining and differentiable constraints.
Panoramic 3D reconstruction gets a boost with PanoVGGT, a Transformer that handles spherical distortions and global-frame ambiguity to deliver state-of-the-art accuracy in a single pass.
Reproducibility in hardware reverse engineering is shockingly low, with only 4% of evaluated artifacts from 187 papers yielding reproducible results.
Federated Computing as Code lets you enforce data sovereignty in federated systems with cryptographic guarantees, moving beyond runtime policies and trust assumptions.
Multilingual transformers spontaneously learn a geometric representation of language distance, and we can extract it to improve low-resource translation.
Forget collapsing videos into text – this hierarchical grid lets you zoom into any moment with lossless visual fidelity, unlocking logarithmic compute scaling for long-form video understanding.
LLMs aren't monolithic black boxes: they contain spatially organized, functionally specialized modules that can be automatically discovered.
Forget dropout – Gaussian Chaos Noise offers provable control over representation deformation and boosts calibration in deep networks.
Instance-specific timestep schedules can significantly boost diffusion model performance, challenging the reliance on global discretization strategies.
Autoregressive neural surrogates can now simulate dynamical systems for infinitely long horizons, thanks to a novel self-refining diffusion model that avoids error compounding.
LLM serving systems can boost Time-To-First-Token (TTFT) attainment by up to 2.4x simply by prioritizing network flows based on a novel approximation of Least-Laxity-First scheduling.
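Least-Laxity-First itself is a classic real-time scheduling policy; a minimal sketch of the exact (unapproximated) priority rule, with hypothetical flow tuples rather than the paper's network-level mechanism, looks like this:

```python
def laxity(deadline, now, remaining_service):
    """Laxity = slack left before the deadline becomes unmeetable."""
    return deadline - now - remaining_service

def llf_order(flows, now):
    """Order flows by Least-Laxity-First: the flow that can least
    afford to wait is served first. Each flow is a tuple of
    (name, deadline, remaining_service_time)."""
    return sorted(flows, key=lambda f: laxity(f[1], now, f[2]))
```

Exact LLF requires continuously recomputing laxities, which is why serving systems approximate it — the teaser's contribution is such an approximation applied to network-flow prioritization.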
Forget slow, single-SSD paging: Swarm unlocks 2.7x higher bandwidth for LLM KV-cache offloading by exploiting stable co-activation patterns to parallelize I/O across multiple SSDs.
Jointly training audio watermarking and source separation unlocks robust multi-stream watermarking, enabling independent tracking of individual audio components within a mix.
Compressing images into 1D token sequences can yield state-of-the-art reconstruction fidelity, challenging the necessity of 2D spatial grids for visual tokenization.
Text-heavy fine-tuning is blinding your MLLM to crucial 3D spatial information, but GAP-MLLM's geometry-aligned pre-training can restore its sight.
By explicitly modeling tooth relationships, TCATSeg achieves state-of-the-art accuracy in 3D dental model segmentation, even in challenging pre-orthodontic cases.
Forget quadratic attention: FEAT achieves state-of-the-art performance on structured data with linear complexity and 40x faster inference.
Feature models, often treated as static configuration spaces, reveal hidden structural patterns and domain-specific deviations when viewed through the lens of network analysis.
Diffusion models can now capture nuanced semantic and material details in image stylization, moving beyond simple color-driven transformations, thanks to a Mixture of Experts architecture.
Masked diffusion language models can now achieve 21.8x better compute efficiency than autoregressive models, thanks to binary encoding and index shuffling.
Ditch the separate models: CAST-TTS uses a single cross-attention mechanism to control TTS timbre from both speech and text, rivaling specialized models in quality.
Forget one-hot encodings: conditioning timbre VAEs on continuous perceptual features unlocks more compact and controllable latent spaces.
DINOv2's powerful visual features come with a hidden flaw: strong positional biases that ALiBi positional encoding can effectively mitigate.
Autonomous vehicles can now see through the storm: a new Mixture of Experts approach boosts 3D object detection accuracy by 15% in adverse weather, without slowing things down.
RepoReviewer tackles the complexity of repository-level code review with a multi-agent architecture, breaking down the monolithic process into manageable stages for more relevant and efficient feedback.
Software energy consumption isn't just an aggregate number – it's a path-dependent journey, and this new model reveals hidden optimization opportunities that can slash energy use by up to 705x.
By combining feed-forward 3D reconstruction with a geometry-aware diffusion model, Leveling3D fills in the gaps in extrapolated novel views, leveling up both 3D reconstruction and generation.
By injecting biological heuristics into a deep learning pipeline, this method achieves state-of-the-art performance in classifying rare white blood cell subtypes, a task where standard deep learning models often fail.
You can now train graph transformers that generalize across different mesh resolutions, thanks to a new architecture that maintains gauge invariance while scaling linearly.
SympFormer achieves faster convergence in attention blocks by drawing inspiration from inertial Nesterov acceleration, offering a potential speedup without additional computational cost.
By forcing a model to reconstruct aggressively masked EEG spectrograms, SpecMoE learns intricate neural patterns across both high- and low-frequency domains, leading to state-of-the-art cross-species EEG decoding.
DynamicGate-MLP learns to selectively activate MLP units based on the input, achieving better compute efficiency without sacrificing performance.
Transformers have a hidden symmetry: depth-wise residuals are secretly doing the same thing as sequence-wise sliding window attention, unlocking new architectural insights.
Fine-tune 123B+ parameter models on a single RTX 4090 with SlideFormer, a system that fits models up to 6x larger and batch sizes up to 8x larger.
Achieve state-of-the-art performance in continuous sign language recognition with 70-80% fewer parameters by unifying spatial and temporal attention.
Achieve sub-microsecond decoding-feedback latency in a scalable, open-source QEC system, bringing fault-tolerant quantum computation closer to reality.
Unfolding the EM algorithm into a neural network yields a speaker localization method that's more robust and accurate than traditional Batch-EM, especially in challenging acoustic conditions.
Deep learning slashes design time for high-efficiency Doherty power amplifiers, enabling complex pixelated combiners that extend the back-off efficiency range.