The paper introduces Loss-Adaptive Capacity Expansion (LACE), an online method that expands a model's representational capacity during continual learning by monitoring deviations in its own loss signal. LACE adds new dimensions to the projection layer when sustained loss deviation exceeds a threshold, which indicates that the current capacity is insufficient for newly encountered data. In the reported experiments, LACE triggers expansions only at domain boundaries, matches the accuracy of a large fixed-capacity model while starting from a fraction of its dimensions, and produces adapter dimensions that are collectively critical to performance, all without labels, replay buffers, or external controllers.
Models can dynamically grow their own capacity during continual learning, adding parameters only when and where they're needed, without human intervention.
Fixed representational capacity is a fundamental constraint in continual learning: practitioners must guess an appropriate model width before training, without knowing how many distinct concepts the data contains. We propose LACE (Loss-Adaptive Capacity Expansion), a simple online mechanism that expands a model's representational capacity during training by monitoring its own loss signal. When sustained loss deviation exceeds a threshold, indicating that the current capacity is insufficient for newly encountered data, LACE adds new dimensions to the projection layer and trains them jointly with existing parameters. Across synthetic and real-data experiments, LACE triggers expansions exclusively at domain boundaries (100% boundary precision, zero false positives), matches the accuracy of a large fixed-capacity model while starting from a fraction of its dimensions, and produces adapter dimensions that are collectively critical to performance (3% accuracy drop when all adapters are removed). We further demonstrate unsupervised domain separation in GPT-2 activations via layer-wise clustering, showing a U-shaped separability curve across layers that motivates adaptive capacity allocation in deep networks. LACE requires no labels, no replay buffers, and no external controllers, making it suitable for on-device continual learning under resource constraints.
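The trigger-and-expand loop described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify how the loss baseline is tracked, how "sustained" is defined, or how new dimensions are initialized. The sketch assumes an exponential moving average baseline, a fixed count of consecutive deviating steps, and small random initialization. The names `LossMonitor` and `expand_projection`, and the parameters `threshold`, `patience`, `momentum`, and `scale`, are hypothetical.

```python
import numpy as np


class LossMonitor:
    """Flags a sustained upward deviation of the loss from a running baseline.

    Assumption: the baseline is an exponential moving average and "sustained"
    means `patience` consecutive steps above baseline + threshold.
    """

    def __init__(self, threshold=0.5, patience=5, momentum=0.9):
        self.threshold = threshold
        self.patience = patience
        self.momentum = momentum
        self.baseline = None
        self.streak = 0

    def update(self, loss):
        """Consume one loss value; return True when an expansion should fire."""
        if self.baseline is None:
            self.baseline = loss
            return False
        if loss - self.baseline > self.threshold:
            # Deviating step: count it, and freeze the baseline meanwhile.
            self.streak += 1
        else:
            self.streak = 0
            self.baseline = (self.momentum * self.baseline
                             + (1.0 - self.momentum) * loss)
        if self.streak >= self.patience:
            # Re-anchor the baseline so one shift triggers only one expansion.
            self.streak = 0
            self.baseline = loss
            return True
        return False


def expand_projection(W, n_new, rng, scale=0.01):
    """Append n_new output dimensions to a projection matrix W (in_dim x out_dim).

    Existing columns are kept unchanged; new columns get small random values
    (an assumed initialization) and would be trained jointly afterwards.
    """
    new_cols = rng.normal(0.0, scale, size=(W.shape[0], n_new))
    return np.concatenate([W, new_cols], axis=1)


def run_stream(losses, W, n_new=2, seed=0):
    """Feed a loss stream through the monitor, expanding W on each trigger."""
    rng = np.random.default_rng(seed)
    monitor = LossMonitor()
    expansion_steps = []
    for step, loss in enumerate(losses):
        if monitor.update(loss):
            W = expand_projection(W, n_new, rng)
            expansion_steps.append(step)
    return W, expansion_steps
```

With a synthetic stream such as `[1.0] * 20 + [3.0] * 20`, which stands in for a domain shift at step 20, `run_stream` fires a single expansion shortly after the shift (once the deviation has persisted for `patience` steps) and leaves the original columns of `W` untouched. In a real training loop the new columns would also need to be registered with the optimizer, which this sketch omits.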