Dynamic allocation of compute resources based on attention entropy can yield significant speedups in long-context LLM inference without sacrificing quality.
User feedback prediction can be learned, revealing critical performance gaps in AI assistants that traditional evaluation methods overlook.
Reinforcement learning can efficiently merge heterogeneous language models in federated ASR, outperforming genetic algorithms and improving character error rate.
CiteLLM is a LaTeX-integrated agent that grounds claims in trusted academic sources, avoiding hallucinations by using LLMs only for search and ranking, not content generation.
Automate customer service without complex orchestration and without sacrificing privacy by using task-oriented flowcharts to guide small, locally deployed language models.