Shanghai Jiao Tong University
Reward models that adapt to fine-grained, task-specific criteria can significantly improve text-to-image generation by better aligning with user preferences.
LLM agents are surprisingly more sensitive to meaning-altering paraphrases than to superficial formatting changes, even when those changes are designed to be equally disruptive.
Ditch the pixel-perfect edits: letting multimodal models fully *reimagine* images based on semantic understanding yields massive quality gains in refinement tasks.
LLMs' hallucinations stem from a "gray zone" of internal belief ambiguity near knowledge boundaries, and geometric denoising in the latent space offers a surprisingly effective way to purge it.
Diffusion Transformers get a 2x speed boost without sacrificing quality thanks to a new routing mechanism that dynamically skips computations based on input sample sparsity.