Search papers, labs, and topics across Lattice.
5
0
8
0
Reasoning with LLMs just got a whole lot faster: MemoSight cuts KV cache footprint by 66% and speeds up inference by 1.56x without sacrificing CoT performance.
Multilingual question answering is harder than you think: even state-of-the-art RAG systems stumble when dealing with questions and knowledge in multiple languages.
Supervised fine-tuning can be dramatically improved by explicitly encouraging exploration of low-confidence data and suppressing high-confidence errors, leading to sustained gains in mathematical reasoning even after extensive RLVR training.
SER models, often assumed to generalize well to synthesized speech, actually fail miserably, revealing their reliance on spurious correlations rather than genuine emotional understanding.
SLMs still lag behind omni language models in multi-turn conversational style control, as revealed by the new StyleBench benchmark.