This survey paper reviews recent advances in Transformer optimization techniques, addressing challenges related to computational efficiency, parameter-efficient fine-tuning, external knowledge integration, and multimodal fusion. It analyzes representative methods within each area, highlighting underlying principles and experimental results. The paper identifies key challenges and proposes future research directions, including unified efficient attention theories and trustworthy knowledge injection mechanisms.
Transformer optimization is about more than just speed: it's also about injecting knowledge, adapting to new tasks, and fusing modalities, revealing a rich landscape of techniques beyond simple efficiency gains.
Since its introduction in 2017, the Transformer has achieved revolutionary breakthroughs in natural language processing and, increasingly, in computer vision. However, its enormous parameter count and high computational complexity pose substantial difficulties for training and inference efficiency, model knowledge updating, and multimodal information fusion. This paper reviews recent research progress on Transformer optimization techniques along four directions: (1) structural optimization and computational efficiency: model architecture improvements, pruning and compression, and efficient attention mechanisms that reduce computational cost; (2) parameter-efficient fine-tuning and task adaptation: new fine-tuning methods with high parameter efficiency, together with few-shot and zero-shot learning paradigms, to improve adaptability in low-resource and multi-task scenarios; (3) external knowledge integration: incorporating knowledge graphs, retrieval-based external memory, and similar sources into Transformers to fill knowledge gaps and enhance commonsense reasoning; (4) multimodal fusion: designing cross-modal Transformer architectures and alignment mechanisms to effectively fuse information from modalities such as vision and language. For each direction, we analyze representative methods, underlying principles, and experimental results, discuss the main challenges, and outline forthcoming research directions, such as unified efficient attention theories, trustworthy knowledge injection mechanisms, green AI training strategies, and next-generation interpretable Transformer architectures.
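As one concrete illustration of the parameter-efficient fine-tuning direction, the sketch below shows a LoRA-style low-rank adapter in plain NumPy: the pretrained weight stays frozen while only two small low-rank factors are trained. The dimensions, rank, and scaling factor here are illustrative assumptions, not values taken from any paper surveyed.

```python
import numpy as np

# Minimal sketch of a LoRA-style low-rank adapter (one example of
# parameter-efficient fine-tuning). All sizes below are assumptions.
rng = np.random.default_rng(0)

d, r, alpha = 64, 4, 8           # hidden size, adapter rank, scaling
W = rng.standard_normal((d, d))  # frozen pretrained weight (not updated)

# Trainable low-rank factors: B starts at zero, so at initialization
# the adapted layer computes exactly the same output as the frozen one.
A = rng.standard_normal((r, d)) * 0.01
B = np.zeros((d, r))

def adapted_forward(x):
    """y = x W^T + (alpha / r) * x (B A)^T -- only A and B are trained."""
    return x @ W.T + (alpha / r) * (x @ (B @ A).T)

x = rng.standard_normal((2, d))
# With B == 0 the adapter contributes nothing yet:
assert np.allclose(adapted_forward(x), x @ W.T)

# Trainable parameters: 2*d*r instead of d*d (512 vs 4096 here).
print(2 * d * r, "trainable vs", d * d, "frozen parameters")
```

The design point is that the fine-tuning update is constrained to a rank-r subspace, so the number of trainable parameters scales as O(d*r) rather than O(d^2).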