This paper introduces Kuramoto oscillatory Phase Encoding (KoPE), a method that augments Vision Transformers with an evolving phase state inspired by neural synchronization. Through synchronization-enhanced structure learning, KoPE improves the training, parameter, and data efficiency of vision models. Experiments show gains on tasks that require structured understanding, including semantic and panoptic segmentation, representation alignment with language, and few-shot abstract visual reasoning (ARC-AGI), suggesting that synchronization can serve as a scalable mechanism for advancing neural network models.
Neural synchronization, long hypothesized to support flexible coordination in biological brains, can now be harnessed to improve the learning efficiency of Vision Transformers.
Spatiotemporal neural dynamics and oscillatory synchronization are widely implicated in biological information processing and have been hypothesized to support flexible coordination such as feature binding. By contrast, most deep learning architectures represent and propagate information through activation values, neglecting the joint dynamics of rate and phase. In this work, we introduce Kuramoto oscillatory Phase Encoding (KoPE), which adds an evolving phase state to Vision Transformers and incorporates a neuro-inspired synchronization mechanism to advance learning efficiency. We show that KoPE can improve training, parameter, and data efficiency of vision models through synchronization-enhanced structure learning. Moreover, KoPE benefits tasks requiring structured understanding, including semantic and panoptic segmentation, representation alignment with language, and few-shot abstract visual reasoning (ARC-AGI). Theoretical analysis and empirical verification further suggest that KoPE can accelerate attention concentration, which in turn supports learning efficiency. These results indicate that synchronization can serve as a scalable, neuro-inspired mechanism for advancing state-of-the-art neural network models.
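As background for the synchronization mechanism the abstract refers to, the following is a minimal sketch of the classic all-to-all Kuramoto model, dθ_i/dt = ω_i + (K/N) Σ_j sin(θ_j − θ_i), integrated with explicit Euler steps. It is not KoPE itself: the abstract does not specify how the phase state is coupled to tokens or attention, so the function names, the coupling strength, the step size, and the initial phases below are all illustrative assumptions.

```python
import cmath
import math


def order_parameter(phases):
    """Kuramoto order parameter r in [0, 1]; r = 1 means all phases coincide."""
    return abs(sum(cmath.exp(1j * p) for p in phases)) / len(phases)


def kuramoto_step(phases, natural_freqs, coupling, dt):
    """One explicit Euler step of the classic all-to-all Kuramoto model."""
    n = len(phases)
    updated = []
    for i in range(n):
        # Mean sinusoidal pull of all oscillators on oscillator i.
        pull = sum(math.sin(phases[j] - phases[i]) for j in range(n)) / n
        updated.append(phases[i] + dt * (natural_freqs[i] + coupling * pull))
    return updated


def simulate(phases, natural_freqs, coupling, dt=0.05, steps=400):
    """Run `steps` Euler updates and return the final phases."""
    for _ in range(steps):
        phases = kuramoto_step(phases, natural_freqs, coupling, dt)
    return phases


# Hypothetical setup: five oscillators with identical (zero) natural frequency.
initial_phases = [0.0, 0.5, 1.0, 1.5, 2.0]
natural_freqs = [0.0] * len(initial_phases)

r_before = order_parameter(initial_phases)
r_after = order_parameter(simulate(initial_phases, natural_freqs, coupling=2.0))
print(f"order parameter before: {r_before:.3f}, after: {r_after:.3f}")
```

With positive coupling and identical natural frequencies, the phases in this toy setup pull together and the order parameter rises toward 1; with zero coupling they stay where they started. In a model along the lines the abstract describes, each oscillator would presumably correspond to a token or feature, but that mapping is an assumption here and is not taken from the paper.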