This paper tackles domain-class incremental learning for VLMs by introducing Dynamic Prefix Weighting (DPW), a method that dynamically adjusts prefix weights based on input token importance. DPW uses a gating module to modulate prefix weights and a weighting mechanism to derive adapter output weights as a residual of prefix-tuning weights. Experiments show DPW achieves state-of-the-art performance in domain-class incremental learning scenarios, outperforming methods that normalize prefix weights.
VLMs adapt better in domain-class incremental learning when each prefix is weighted dynamically by the importance of its input token, rather than through normalized weights that ignore how much adjustment each token needs.
We investigate recently introduced domain-class incremental learning scenarios for vision-language models (VLMs). Recent works address this challenge using parameter-efficient methods, such as prefix-tuning or adapters, which facilitate model adaptation to downstream tasks by incorporating task-specific information into input tokens through additive vectors. However, previous approaches often normalize the weights of these vectors, disregarding the fact that different input tokens require different degrees of adjustment. To overcome this issue, we propose Dynamic Prefix Weighting (DPW), a framework that dynamically assigns weights to prefixes, complemented by adapters. DPW consists of 1) a gating module that adjusts the weights of each prefix based on the importance of the corresponding input token, and 2) a weighting mechanism that derives adapter output weights as a residual of prefix-tuning weights, ensuring that adapters are utilized only when necessary. Experimental results demonstrate that our method achieves state-of-the-art performance in domain-class incremental learning scenarios for VLMs. The code is available at: https://github.com/YonseiML/dpw.
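The two components described above can be illustrated with a minimal sketch. It is not the authors' implementation (see the linked repository for that), and it rests on assumptions that do not come from the abstract: the gate is taken to be a sigmoid over a linear score of the token, the adapter weight is taken to be exactly one minus the prefix weight, and the names `weighted_update`, `gate_w`, `gate_b`, `prefix_delta`, and `adapter_delta` are hypothetical.

```python
import math


def sigmoid(z):
    """Logistic function mapping a real score to (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def weighted_update(token, prefix_delta, adapter_delta, gate_w, gate_b):
    """Apply two additive vectors to one token with token-dependent weights.

    ASSUMPTION: the gate is a sigmoid of a linear score of the token, and the
    adapter weight is the residual 1 - prefix_weight. The paper's actual
    gating and weighting functions may differ.
    """
    # Token-dependent score standing in for "input token importance".
    score = sum(w * t for w, t in zip(gate_w, token)) + gate_b
    prefix_weight = sigmoid(score)

    # The adapter receives whatever weight the prefix did not use.
    adapter_weight = 1.0 - prefix_weight

    updated = [
        t + prefix_weight * p + adapter_weight * a
        for t, p, a in zip(token, prefix_delta, adapter_delta)
    ]
    return updated, prefix_weight, adapter_weight


if __name__ == "__main__":
    token = [1.0, 2.0]
    out, pw, aw = weighted_update(
        token,
        prefix_delta=[2.0, 2.0],
        adapter_delta=[4.0, 4.0],
        gate_w=[0.0, 0.0],  # zero gate parameters give an even 0.5 / 0.5 split
        gate_b=0.0,
    )
    print(out, pw, aw)
```

Under these assumptions, a token with a high gate score is adjusted almost entirely by the prefix vector, and the adapter contributes only as the gate score falls, which mirrors the abstract's statement that adapters are used only when necessary.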