Stop sacrificing subject fidelity for editability: DisCo lets you have both in text-to-image generation by disentangling and recoupling visual and textual information.
Current subject-driven text-to-image models struggle with specific subject categories and prompt scenarios, a problem exposed by a new benchmark that also offers actionable insights for improvement.
LLMs can perfectly cluster speakers in overlapping multi-party conversations, enabling near-perfect Joint ASR-Clustering Error Rate in challenging CHiME-9 tasks.