The paper introduces Selective Synchronization Attention (SSA), an attention mechanism inspired by the Kuramoto model of coupled oscillators that targets two weaknesses of standard dot-product attention: quadratic complexity and a lack of grounding in biological neural computation. SSA represents each token as an oscillator with a learnable natural frequency and phase, uses the synchronization strength between token pairs as the attention weight, and obtains sparsity from a phase-locking threshold. The authors instantiate SSA within the Oscillatory Synchronization Network (OSN) and report that its synchronization matrices show non-uniform, head-diverse coupling patterns even at initialization, which they interpret as a stronger architectural inductive bias than the approximately uniform attention of randomly initialized Transformers.
Ditch the dot product: this new attention mechanism uses coupled oscillators to achieve sparsity and encode position, all in a single pass.
The Transformer architecture has become the foundation of modern deep learning, yet its core self-attention mechanism suffers from quadratic computational complexity and lacks grounding in biological neural computation. We propose Selective Synchronization Attention (SSA), a novel attention mechanism that replaces the standard dot-product self-attention with a closed-form operator derived from the steady-state solution of the Kuramoto model of coupled oscillators. In SSA, each token is represented as an oscillator characterized by a learnable natural frequency and phase; the synchronization strength between token pairs, determined by a frequency-dependent coupling and phase-locking condition, serves as the attention weight. This formulation provides three key advantages: (i) natural sparsity arising from the phase-locking threshold, whereby tokens with incompatible frequencies automatically receive zero attention weight without explicit masking; (ii) unified positional-semantic encoding through the natural frequency spectrum, eliminating the need for separate positional encodings; and (iii) a single-pass, closed-form computation that avoids iterative ODE integration, with all components (coupling, order parameter, synchronization) derived from the oscillatory framework. We instantiate SSA within the Oscillatory Synchronization Network (OSN), a drop-in replacement for the Transformer block. Analysis of the synchronization matrices reveals non-uniform, head-diverse coupling patterns even at initialization, demonstrating a stronger architectural inductive bias than the approximately uniform attention produced by randomly initialized Transformers.
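As a rough illustration of how a phase-locking threshold can produce sparse attention weights, the sketch below uses the textbook steady state of two coupled Kuramoto oscillators. Under the convention dφ/dt = Δω − K sin φ, a locked state exists only when |Δω| ≤ K, and the stable locked phase difference satisfies sin φ = Δω/K, so cos φ = sqrt(1 − (Δω/K)²). This is not the paper's operator: the constant scalar coupling (the abstract describes a frequency-dependent one), the use of cos φ as the synchronization strength, the row normalization, and the names `sync_attention`, `omega`, and `coupling` are all assumptions made for this example.

```python
import numpy as np


def sync_attention(omega, values, coupling=1.0):
    """Toy synchronization-based attention (illustrative assumption,
    not the operator defined in the paper).

    omega:    (n,) natural frequency of each token-oscillator
    values:   (n, d) value vectors to be mixed
    coupling: scalar K > 0, assumed constant across all pairs
    """
    # Pairwise frequency gaps, shape (n, n).
    delta = omega[:, None] - omega[None, :]
    ratio = delta / coupling

    # Phase-locking condition: a locked state exists only if |delta| <= K.
    locked = np.abs(ratio) <= 1.0

    # Stable locked phase difference has sin(phi) = ratio, so
    # cos(phi) = sqrt(1 - ratio^2). The clip keeps sqrt away from
    # negative inputs on the unlocked pairs, which are zeroed anyway.
    cos_phi = np.sqrt(np.clip(1.0 - ratio**2, 0.0, None))
    strength = np.where(locked, cos_phi, 0.0)

    # Assumed normalization: each row sums to 1. The diagonal entry is
    # always 1 (a token is locked with itself), so row sums are nonzero.
    weights = strength / strength.sum(axis=1, keepdims=True)
    return weights @ values, weights


# Usage: three tokens, the third with a frequency far from the others.
omega = np.array([0.0, 0.5, 3.0])
values = np.eye(3)
out, weights = sync_attention(omega, values, coupling=1.0)
print(weights)
```

With frequencies 0.0, 0.5, and 3.0 and K = 1, the third token lies outside the locking range of the other two, so those pairs get exactly zero weight without any explicit mask, and the third token attends only to itself.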