Nanjing University
Current audio-language models are surprisingly bad at controlling and interpreting subtle vocal cues, failing in nearly half of situational dialogue scenarios.
Achieve near-lossless performance in autonomous driving VLMs with 90% token reduction – without any training.
Achieve state-of-the-art 3D Gaussian Splatting segmentation by identifying ambiguous Gaussians at object boundaries and enforcing spatial continuity via NeRF alignment.
RL can teach LLMs to be better interviewers, adaptively prompting users to reveal hidden information in dialogue.
Ditching the critic doesn't mean sacrificing fine-grained credit assignment: RTMC leverages overlapping states in rollout trees to estimate per-step Q-values, outperforming critic-free baselines on SWE-bench.
Hyperspectral anomaly detection gets a serious upgrade: R2VD ditches scalar reconstruction errors for high-dimensional vector interference patterns, achieving state-of-the-art target detection and background suppression.
Existing audio deepfake detectors are sitting ducks outside the lab: AT-ADD introduces a challenge to push research towards real-world robustness and generalization across all audio types.