This paper introduces LargeMonitor, a framework for online task-free continual learning (TFCL) that uses large pretrained models to autonomously detect and adapt to distribution shifts in data streams. It combines a decoupled detection module, built on frozen large vision models (LVMs) for zero-shot drift detection, with a context-aware diagnostic module that uses large multimodal models (LMMs) to interpret the semantic cause of a shift. This lets continual learners adapt without relying on training-coupled signals such as loss fluctuations or evolving latent distances. Experiments across multiple TFCL settings and benchmarks show that LargeMonitor detects and diagnoses complex data variations precisely and consistently improves the performance of existing online TFCL algorithms.
LargeMonitor provides robust, zero-shot drift detection and shift-specific adaptation for online continual learning, improving existing methods that are otherwise agnostic to the structural causes of data shifts.
Online task-free continual learning (TFCL) requires intelligent agents to sequentially accumulate knowledge from an unbounded, non-stationary data stream under strict single-pass constraints and without any explicit task identifiers. Existing online TFCL paradigms primarily rely on parameter-efficient prompt tuning or dynamic structure expansion driven by training-coupled optimization dynamics, such as empirical loss fluctuations or evolving latent distances. As a result, these training-coupled solvers remain agnostic to the structural origins of distribution drift and mechanically apply a fixed strategy across fundamentally distinct streaming variations. To address this gap, we propose LargeMonitor, a framework that leverages large pretrained foundation models to autonomously orchestrate task-free continual adaptation. Specifically, LargeMonitor introduces a decoupled detection module that uses the frozen, stable representation space of large vision models (LVMs) to achieve robust, zero-shot drift detection without training-dependent interference or brittle threshold tuning. Once a drift is confirmed, the framework activates a context-aware diagnostic module driven by large multimodal models (LMMs) to interpret the semantic cause of the stream variation (e.g., novel class emergence vs. environmental domain shift). This two-stage design allows the continual learner to dynamically deploy adaptive, shift-specific optimization strategies. Extensive experiments across multiple TFCL settings and benchmarks demonstrate that LargeMonitor achieves precise, robust detection and diagnosis of complex data streams while consistently improving the performance of existing online TFCL algorithms.
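The detect-then-diagnose control flow described above can be sketched as follows. This is an illustration only, not the authors' implementation. The abstract does not specify the detection statistic, so the sketch substitutes a plain cosine distance between mean embeddings with a fixed threshold, which is exactly the kind of tuned threshold the paper says it avoids. The names `DriftMonitor`, `diagnose`, `strategies`, `threshold`, and `window` are all hypothetical. The embeddings are assumed to come from some frozen encoder, and `diagnose` stands in for the LMM-based diagnostic module.

```python
import math
from collections import deque
from typing import Callable, Deque, Dict, List, Sequence

Vector = Sequence[float]


def mean_vector(batch: Sequence[Vector]) -> List[float]:
    """Element-wise mean of a batch of equal-length embedding vectors."""
    n = len(batch)
    return [sum(column) / n for column in zip(*batch)]


def cosine_distance(a: Vector, b: Vector) -> float:
    """1 - cosine similarity; returns 0.0 if either vector has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 0.0 if norm == 0.0 else 1.0 - dot / norm


class DriftMonitor:
    """Hypothetical two-stage monitor: detect drift, then diagnose its type."""

    def __init__(
        self,
        diagnose: Callable[[Sequence[Vector]], str],  # stand-in for the LMM stage
        strategies: Dict[str, str],                   # shift type -> strategy name
        threshold: float = 0.2,                       # assumed, not from the paper
        window: int = 4,                              # reference batches to keep
    ) -> None:
        self.diagnose = diagnose
        self.strategies = strategies
        self.threshold = threshold
        self.reference: Deque[List[float]] = deque(maxlen=window)

    def step(self, embeddings: Sequence[Vector]) -> str:
        """Process one batch of frozen-encoder embeddings; return an action."""
        current = mean_vector(embeddings)
        if not self.reference:
            # The first batch only seeds the reference window.
            self.reference.append(current)
            return "no_drift"
        reference_mean = mean_vector(list(self.reference))
        if cosine_distance(reference_mean, current) <= self.threshold:
            self.reference.append(current)
            return "no_drift"
        # Stage 2 runs only after a detected drift.
        shift_type = self.diagnose(embeddings)
        # Restart the reference window from the new distribution.
        self.reference.clear()
        self.reference.append(current)
        return self.strategies.get(shift_type, "default")
```

A caller would construct the monitor with a diagnosis callable and a mapping from shift types to strategy names, then call `step` once per incoming batch. The point of the design is that the learner's training loss never enters the detection path: `step` sees only the encoder's embeddings, which is what "decoupled" means in the abstract.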