This paper introduces a data-driven pipeline to improve LLM function calling for online financial question answering, addressing the challenge of adapting generic LLMs to the financial domain and handling diverse user queries. The pipeline involves constructing and periodically updating a dataset with real user queries and a novel data augmentation method called AugFC, which explores diverse parameter values. Experiments on offline datasets and online deployment demonstrate the superiority of the pipeline, which has been adopted in the YuanBao financial QA platform.
Fine-tuning LLMs with a data-driven pipeline that combines real user queries with a new augmentation method (AugFC) improves function calling performance in online financial QA systems, both on offline datasets and in deployment.
Large language models (LLMs) have been incorporated into numerous industrial applications. Meanwhile, a vast array of API assets is scattered across various functions in the financial domain. An online financial question-answering system can leverage both LLMs and private APIs to provide timely financial analysis and information. The key is equipping the LLM with function calling capability tailored to the financial scenario. However, a generic LLM has no knowledge of customized financial APIs and struggles to adapt to the financial domain. Additionally, online user queries are diverse and contain parameters that are out-of-distribution relative to the required function input parameters, which makes it more difficult for a generic LLM to serve online users. In this paper, we propose a data-driven pipeline to enhance LLM function calling for our deployed online financial QA system, comprising dataset construction, data augmentation, and model training. Specifically, we construct a dataset based on a previous study and update it periodically, incorporating user queries and an augmentation method named AugFC. Adding samples derived from user queries exploits our financial toolset in a data-driven manner, while AugFC explores the possible parameter values to enhance the diversity of the updated dataset. We then train an LLM with a two-step method, which enables it to use our financial functions. Extensive experiments on existing offline datasets, as well as deployment in an online scenario, demonstrate the superiority of our pipeline. The pipeline has been adopted in the financial QA of YuanBao (https://yuanbao.tencent.com/chat/), one of the largest chat platforms in China.
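To make the idea of exploring parameter values concrete, the following is a minimal Python sketch of one simple way to generate function-calling training samples by enumerating candidate values for each parameter. It is not the paper's AugFC algorithm, whose details are not given in the abstract. The tool name `get_stock_price`, its parameters, the candidate values, and the query template are all hypothetical and chosen only for illustration.

```python
from itertools import product

# Hypothetical tool description: the name, parameters, and candidate
# values below are illustrative assumptions, not taken from the paper.
STOCK_PRICE_TOOL = {
    "name": "get_stock_price",
    "parameters": {
        "ticker": ["AAPL", "TSLA"],
        "period": ["week", "month", "year"],
    },
}

# Hypothetical query template with one slot per parameter.
TEMPLATE = "What was the price of {ticker} over the last {period}?"


def augment(tool, template):
    """Enumerate every combination of candidate parameter values and pair
    each templated query with the function call it should produce."""
    names = list(tool["parameters"])
    samples = []
    for values in product(*(tool["parameters"][n] for n in names)):
        args = dict(zip(names, values))
        samples.append({
            "query": template.format(**args),
            "call": {"name": tool["name"], "arguments": args},
        })
    return samples


samples = augment(STOCK_PRICE_TOOL, TEMPLATE)
```

Each resulting sample pairs a natural-language query with its target function call, which is the general shape of data a function-calling model is fine-tuned on. A real pipeline would likely sample values rather than enumerate them exhaustively, since the number of combinations grows multiplicatively with the number of parameters.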