Search papers, labs, and topics across Lattice.
The paper introduces Chat-RSC, a novel method leveraging Large Language Models (LLMs) to facilitate interactive remote sensing image classification. Chat-RSC uses two agents to transform image characteristics into state machines and activities, allowing users to control the classification process through natural language via the LLM's function calling capabilities. Experiments demonstrate that users can achieve remote sensing classification goals with limited interactions, showcasing the potential of LLMs to bridge the gap between complex remote sensing tools and a broader audience.
Control remote sensing image classification with natural language, making complex tools accessible to a broader audience via LLMs.
ABSTRACT Large language models (LLMs) have demonstrated remarkable collaboration and interaction capabilities across numerous work domains. Therefore, could LLMs also contribute to remote sensing classification and interpretation, which are among the most fundamental tasks in the field of remote sensing? To answer this question, the ‘Chat with remote sensing image classification’ (Chat-RSC) method is proposed in this paper. Chat-RSC transforms the classified characteristics of remote sensing images into a series of state machines and activities and integrates activities, states, and the complex internal classification process using two agents to produce controllable classification functions. In Chat-RSC, the LLM acts as a bridge, linking the natural language descriptions of users’ remote sensing classification expectations with specific agent functions. It captures the key intentions and parameters of the users through the ‘function calling’ ability of the LLM, enabling control of remote sensing image classification process through conversion. In experiments, Chat-RSC enables control through natural language conversion, and the results show that users can achieve remote sensing classification goals with a limited number of interactions. Chat-RSC is presented as an innovative effort to integrate complex remote sensing functions and processing models via large language models, making these sophisticated tools more accessible to a broader audience.