Beihang | SJTU | Mar 4, 2026 | arXiv:2603.04592

From Static Inference to Dynamic Interaction: Navigating the Landscape of Streaming Large Language Models

Junlong Tong, Zilong Wang, YuJie Ren, Peiran Yin, Xiaoyu Shen

AI Summary

This paper provides a unified definition and systematic taxonomy of streaming Large Language Models (LLMs), addressing the fragmented understanding of streaming generation, inputs, and interactive architectures. The authors establish a definition based on data flow and dynamic interaction, clarifying ambiguities in the field. They further explore applications and future research directions for streaming LLMs, accompanied by a continuously updated repository of relevant papers.

Key Contribution

This paper untangles the fragmented notion of "streaming LLMs" with a clear taxonomy that distinguishes streaming generation, streaming inputs, and interactive architectures.

Abstract

Standard Large Language Models (LLMs) are predominantly designed for static inference with pre-defined inputs, which limits their applicability in dynamic, real-time scenarios. To address this gap, the streaming LLM paradigm has emerged. However, existing definitions of streaming LLMs remain fragmented, conflating streaming generation, streaming inputs, and interactive streaming architectures, while a systematic taxonomy is still lacking. This paper provides a comprehensive overview and analysis of streaming LLMs. First, we establish a unified definition of streaming LLMs based on data flow and dynamic interaction to clarify existing ambiguities. Building on this definition, we propose a systematic taxonomy of current streaming LLMs and conduct an in-depth discussion on their underlying methodologies. Furthermore, we explore the applications of streaming LLMs in real-world scenarios and outline promising research directions to support ongoing advances in streaming intelligence. We maintain a continuously updated repository of relevant papers at https://github.com/EIT-NLP/Awesome-Streaming-LLMs.

Architecture Design (Transformers, SSMs, MoE) | Inference & Quantization | Natural Language Processing

Citation Metrics

Citations: 0

Influential citations: 0

References: 0

Year: 2026

Venue: N/A
