The paper introduces DRAMA, a framework for efficient multi-domain neural information retrieval that uses domain-specific adapter modules and a dynamic gating mechanism to select relevant domain knowledge for each query. This approach addresses the scalability and sustainability challenges of deploying neural IR models in multi-domain scenarios by reducing computational and energy requirements. Experiments on Web retrieval benchmarks demonstrate that DRAMA achieves comparable effectiveness to domain-specific models with significantly fewer parameters and computational resources.
Slash the environmental footprint of neural IR by adaptively routing queries to lightweight, domain-specific modules, achieving comparable accuracy to full-sized models at a fraction of the cost.
Neural models are increasingly used in Web-scale Information Retrieval (IR). However, relying on these models introduces substantial computational and energy requirements, leading to increasing attention toward their environmental cost and the sustainability of large-scale deployments. While neural IR models deliver high retrieval effectiveness, their scalability is constrained in multi-domain scenarios, where training and maintaining domain-specific models is inefficient and achieving robust cross-domain generalisation within a unified model remains difficult. This paper introduces DRAMA (Domain Retrieval using Adaptive Module Allocation), an energy- and parameter-efficient framework designed to reduce the environmental footprint of neural retrieval. DRAMA integrates domain-specific adapter modules with a dynamic gating mechanism that selects the most relevant domain knowledge for each query. New domains can be added efficiently through lightweight adapter training, avoiding full model retraining. We evaluate DRAMA on multiple Web retrieval benchmarks covering different domains. Our extensive evaluation shows that DRAMA achieves comparable effectiveness to domain-specific models while using only a fraction of their parameters and computational resources. These findings show that energy-aware model design can significantly improve scalability and sustainability in neural IR.
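The abstract describes the architecture only at a high level, so the following is a minimal sketch of the general idea, not the authors' implementation. It assumes a frozen shared encoder (stood in for by a random linear map), bottleneck-style adapters with a residual connection, and a softmax gate that hard-routes each query to its top-scoring domain. The class names, dimensions, and domain labels are all hypothetical.

```python
import math
import random


def matvec(matrix, vector):
    """Multiply a row-major matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def random_matrix(rows, cols, rng, scale=0.1):
    return [[rng.uniform(-scale, scale) for _ in range(cols)] for _ in range(rows)]


class Adapter:
    """Hypothetical bottleneck adapter: h + up(relu(down(h)))."""

    def __init__(self, dim, bottleneck, rng):
        self.down = random_matrix(bottleneck, dim, rng)
        self.up = random_matrix(dim, bottleneck, rng)

    def __call__(self, hidden):
        squeezed = [max(0.0, x) for x in matvec(self.down, hidden)]
        return [h + u for h, u in zip(hidden, matvec(self.up, squeezed))]

    def num_params(self):
        return len(self.down) * len(self.down[0]) + len(self.up) * len(self.up[0])


class GatedAdapterEncoder:
    """Shared backbone plus per-domain adapters selected by a gate (assumed design)."""

    def __init__(self, dim, bottleneck, seed=0):
        self.dim = dim
        self.bottleneck = bottleneck
        self.rng = random.Random(seed)
        # Stand-in for a frozen pretrained encoder; never modified after creation.
        self.backbone = random_matrix(dim, dim, self.rng)
        self.domains = []
        self.adapters = {}
        self.gate = []  # one weight row per registered domain

    def add_domain(self, name):
        """Register a new domain: only an adapter and one gate row are created."""
        self.domains.append(name)
        self.adapters[name] = Adapter(self.dim, self.bottleneck, self.rng)
        self.gate.append([self.rng.uniform(-0.1, 0.1) for _ in range(self.dim)])

    def route(self, features):
        """Return (chosen domain, gate probabilities, backbone output)."""
        if not self.domains:
            raise ValueError("no domains registered")
        hidden = matvec(self.backbone, features)
        probs = softmax(matvec(self.gate, hidden))
        best = max(range(len(probs)), key=probs.__getitem__)
        return self.domains[best], probs, hidden

    def encode(self, features):
        """Encode a query using only the adapter of the domain the gate selects."""
        domain, _, hidden = self.route(features)
        return domain, self.adapters[domain](hidden)


encoder = GatedAdapterEncoder(dim=8, bottleneck=2)
for name in ("news", "biomedical"):  # hypothetical domain labels
    encoder.add_domain(name)
chosen, embedding = encoder.encode([0.5] * 8)
print(chosen, len(embedding), encoder.adapters[chosen].num_params())
```

In this sketch, adding a domain touches only the new adapter and its gate row, which mirrors the abstract's claim that new domains are added through lightweight adapter training rather than full retraining. Hard top-1 routing is one possible choice; a weighted mixture of adapter outputs would be another, and the abstract does not say which DRAMA uses.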