Forget compressing entire tokens – selectively routing *parts* of tokens based on query relevance unlocks better compression-quality tradeoffs in LoRA-adapted transformers.
Achieve state-of-the-art remote sensing image-text retrieval without the computational burden of large-scale vision-language model pre-training, thanks to a novel two-stage approach.
Dramatically improve multimodal recommendation accuracy without any training by initializing user embeddings with item modality features and user cluster information.
LLMs are still far from being autonomous scientists, failing to master even simplified, end-to-end physics research workflows.
Neural video codecs can be designed for biological substrates from the ground up, unlocking a new paradigm for DNA storage.
AMRs can now navigate reliably indoors without GPS or external infrastructure, thanks to a new method that simultaneously calibrates magnetometers and estimates robot pose.
Aligning speech VAEs with SSL features isn't a one-size-fits-all game: joint-marginal alignment with adaptive weighting unlocks superior performance across reconstruction, understanding, and generation.
Achieve industrial anomaly detection that not only locates defects, but explains them and generates controlled edits, all in one model.
Stop treating diffusion workflows as monolithic black boxes: LegoDiffusion unlocks 3x higher throughput by decomposing them into independently scalable microservices.
Squeeze 34% more decode speed out of your MoE model without sacrificing accuracy by intelligently budgeting expert activations.
CubeGraph achieves superior RAG performance by unifying vector and spatial search, eliminating the overhead of fragmented sub-index invocations common in existing systems.
VLMs in self-driving cars are shockingly vulnerable: a subtle combination of graffiti and foreign-language commands can hijack their behavior without degrading performance on normal tasks.
Multimodal LLMs are surprisingly vulnerable to backdoor attacks, but a simple patch-based augmentation and cross-view regularization can drastically improve robustness without sacrificing performance.
Stop treating tests as immutable oracles: letting repair agents revise behavioral constraints during search dramatically improves issue resolution.
Achieve near-lossless 4-bit quantization for LLMs in under a minute, without full fine-tuning, by correcting for non-uniform activation distributions.
LLMs can learn to reason *worse* from seemingly better training data: models trained on CoT data with lower loss can generalize poorly due to inheriting inefficient, divergent reasoning patterns.
Achieve significantly better structure preservation in text-guided image editing by injecting structure-related features into visual autoregressive models, guided by reinforcement learning.
LongCat-Next shatters the language-centric paradigm by unifying text, vision, and audio into a single autoregressive model with minimal modality-specific design, finally reconciling understanding and generation in discrete vision modeling.
Color image restoration gets a boost: exploiting saturation-value similarity in nonlocal methods yields significantly better results than relying on individual RGB channels.
Lossless compression can actually *speed up* LLM inference on GPUs, not just shrink model size, thanks to ZipServ's hardware-aware design.
Sports expose surprising limitations in VLMs' spatial reasoning: despite fine-tuning gains on a new, large-scale sports dataset, current models fail to generalize beyond existing benchmarks.
Unleashing heterogeneous robot swarms: a new data-driven method achieves cooperative localization even with sparse, unidirectional measurements, sidestepping restrictive geometric constraints.
Instruction-based image editing models still struggle to edit small objects, with a new benchmark revealing significant performance gaps despite progress on existing benchmarks.
ARLArena reveals the hidden instability of agentic RL, offering a path to more reliable LLM-based agents via a novel stable policy optimization method (SAMPO).
Ditch the quadratic cost and black-box nature of neural operators: Gaussian Particle Operators offer interpretable, near-linear PDE learning by representing fields as learned Gaussian atoms.
Achieve state-of-the-art LDCT image restoration with a Green Learning approach that's mathematically transparent, computationally efficient, and memory-friendly.
Achieve up to 5.48x speedup in merging proximity graph indexes for AKNN search by intelligently exploiting structural information, outperforming naive reconstruction by nearly 10x.
By randomly attending to different time patches and progressively mixing scales, SEMixer achieves state-of-the-art long-term time series forecasting with a lightweight architecture.
Stop letting noisy, low-predictability data ruin your time series models: APTF dynamically identifies and penalizes these samples during training, leading to improved forecasting and classification accuracy.
Forget small, curated datasets: DeepVision-103K unlocks stronger multimodal reasoning in LMMs via diverse, verifiable visual math problems.
This consensus provides expert recommendations for DAA-HJA in elderly patients with FNF, addressing key clinical dilemmas and promoting standardized surgical techniques.
This review highlights the unique challenges in managing PJI after tumor megaprosthetic reconstruction, emphasizing the need for tailored diagnostic and treatment strategies due to the elevated risk compared to standard arthroplasty.