Mar 12, 2026 | arXiv:2603.12067

Beyond Convolution: A Taxonomy of Structured Operators for Learning-Based Image Processing

AI Summary

This paper introduces a taxonomy of structured operators that go beyond standard convolution for learning-based image processing. The taxonomy categorizes operators into five families: decomposition-based, adaptive weighted, basis-adaptive, integral/kernel, and attention-based operators, each offering different structural properties compared to convolution. By analyzing these operators across dimensions like linearity, locality, and computational cost, the paper provides guidance on their suitability for various image processing tasks and highlights open research challenges.

Key Contribution

Convolution may no longer be the default choice in image processing: this paper maps out five families of structured operators that address its limitations in capturing low-rank structure, adaptive bases, and non-uniform spatial dependencies.

Abstract

The convolution operator is the fundamental building block of modern convolutional neural networks (CNNs), owing to its simplicity, translational equivariance, and efficient implementation. However, its structure as a fixed, linear, locally-averaging operator limits its ability to capture structured signal properties such as low-rank decompositions, adaptive basis representations, and non-uniform spatial dependencies. This paper presents a systematic taxonomy of operators that extend or replace the standard convolution in learning-based image processing pipelines. We organise the landscape of alternative operators into five families: (i) decomposition-based operators, which separate structural and noise components through singular value or tensor decompositions; (ii) adaptive weighted operators, which modulate kernel contributions as a function of spatial position or signal content; (iii) basis-adaptive operators, which optimise the analysis bases together with the network weights; (iv) integral and kernel operators, which generalise the convolution to position-dependent and non-linear kernels; and (v) attention-based operators, which relax the locality assumption entirely. For each family, we provide a formal definition, a discussion of its structural properties with respect to the convolution, and a critical analysis of the tasks for which the operator is most appropriate. We further provide a comparative analysis of all families across relevant dimensions -- linearity, locality, equivariance, computational cost, and suitability for image-to-image and image-to-label tasks -- and outline the open challenges and future directions of this research area.
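To make the contrast between convolution and some of the families named in the abstract concrete, here is a minimal NumPy sketch. It is an illustration under stated assumptions, not the paper's definitions: the function names are hypothetical, the adaptive filter uses bilateral-style range weights as one possible instance of content-dependent weighting, and the attention example omits the learned query, key, and value projections that practical models use.

```python
import numpy as np


def conv2d_same(image, kernel):
    """Fixed, linear, local operator: one kernel shared by every pixel.

    Implemented as cross-correlation with zero padding, which is what most
    deep-learning libraries call "convolution".
    """
    kh, kw = kernel.shape
    ph, pw = kh // 2, kw // 2
    padded = np.pad(image, ((ph, ph), (pw, pw)))
    out = np.zeros_like(image, dtype=float)
    for i in range(image.shape[0]):
        for j in range(image.shape[1]):
            out[i, j] = np.sum(padded[i:i + kh, j:j + kw] * kernel)
    return out


def low_rank_project(image, rank):
    """Decomposition-based sketch: keep only the top `rank` singular components."""
    u, s, vt = np.linalg.svd(image, full_matrices=False)
    return (u[:, :rank] * s[:rank]) @ vt[:rank]


def content_adaptive_filter(image, radius=1, sigma=0.1):
    """Adaptive weighted sketch: weights depend on the signal content.

    Assumption: bilateral-style range weights, so neighbours with similar
    intensity to the centre pixel contribute more. The operator is local
    but no longer linear in the input.
    """
    h, w = image.shape
    size = 2 * radius + 1
    padded = np.pad(image, radius, mode="edge")
    out = np.zeros_like(image, dtype=float)
    for i in range(h):
        for j in range(w):
            patch = padded[i:i + size, j:j + size]
            weights = np.exp(-((patch - image[i, j]) ** 2) / (2 * sigma ** 2))
            out[i, j] = np.sum(weights * patch) / np.sum(weights)
    return out


def self_attention(image, scale=1.0):
    """Attention-based sketch: every pixel attends to every other pixel.

    Assumption: pixels are scalar tokens and there are no learned
    projections, so this only illustrates the loss of locality.
    """
    x = image.reshape(-1, 1).astype(float)
    scores = scale * (x @ x.T)
    scores -= scores.max(axis=1, keepdims=True)  # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=1, keepdims=True)
    return (weights @ x).reshape(image.shape)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    img = rng.random((8, 8))
    box = np.full((3, 3), 1.0 / 9.0)
    print(conv2d_same(img, box).shape)
    print(np.linalg.matrix_rank(low_rank_project(img, 2)))
    print(content_adaptive_filter(img).shape)
    print(self_attention(img).shape)
```

The cost difference is visible even in this toy form: the convolution and the adaptive filter touch a fixed neighbourhood per pixel, while the attention sketch builds an N-by-N weight matrix over all N pixels.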

Architecture Design (Transformers, SSMs, MoE) | Computer Vision | Training Efficiency & Optimization

Citation Metrics

Citations: 0

Influential citations: 0

References: 43

Year: 2026

Venue: N/A
