Search papers, labs, and topics across Lattice.
2
0
5
5
Fine-tuning LLMs with a data-driven pipeline that incorporates real user queries and a new augmentation method (AugFC) dramatically improves function calling performance in online financial QA systems.
LLM agents can learn to solve complex, long-horizon tasks much more effectively by using themselves as post-hoc critics to refine Q-values through hindsight reasoning.