This paper introduces a square superpixel generation method based on granular-ball computing to address the limitations of irregularly shaped superpixels in vision tasks. The approach approximates superpixels with multi-scale square blocks, selected by a purity score derived from pixel-intensity similarity, which enables efficient parallel processing and learnable feature extraction. The resulting square superpixels can serve as graph nodes in GNNs or as tokens in ViTs for multi-scale information aggregation, and experiments on downstream tasks report consistent performance improvements.
Square superpixels, generated via granular-ball computing, replace irregular regions with multi-scale square blocks, enabling efficient parallel processing and end-to-end optimization in deep learning pipelines.
Superpixels provide a compact region-based representation that preserves object boundaries and local structures, and have therefore been widely used in a variety of vision tasks to reduce computational cost. However, most existing superpixel algorithms produce irregularly shaped regions, which are not well aligned with regular operators such as convolutions. Consequently, superpixels are often treated as an offline preprocessing step, limiting parallel implementation and hindering end-to-end optimization within deep learning pipelines. Motivated by the adaptive representation and coverage property of granular-ball computing, we develop a square superpixel generation approach. Specifically, we approximate superpixels using multi-scale square blocks to avoid the computational and implementation difficulties induced by irregular shapes, enabling efficient parallel processing and learnable feature extraction. For each block, a purity score is computed based on pixel-intensity similarity, and high-quality blocks are selected accordingly. The resulting square superpixels can be readily integrated as graph nodes in graph neural networks (GNNs) or as tokens in Vision Transformers (ViTs), facilitating multi-scale information aggregation and structured visual representation. Experimental results on downstream tasks demonstrate consistent performance improvements, validating the effectiveness of the proposed method.
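The block-selection idea described above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the exact purity formula or selection rule, so the code below assumes that purity is the fraction of pixels whose intensity lies within a tolerance `tol` of the block mean, and that a block failing a threshold `purity_thresh` is split into four half-size blocks, coarse to fine. The function names and all parameters (`block_purity`, `square_superpixels`, `max_size`, `min_size`, `tol`, `purity_thresh`) are hypothetical.

```python
import numpy as np


def block_purity(block, tol=0.1):
    """Assumed purity: fraction of pixels within `tol` of the block's mean intensity."""
    return float(np.mean(np.abs(block - block.mean()) <= tol))


def square_superpixels(image, max_size=8, min_size=1, purity_thresh=0.95, tol=0.1):
    """Cover a grayscale image with non-overlapping square blocks (y, x, size).

    Assumes the image height and width are multiples of `max_size`, and that
    `max_size` is `min_size` times a power of two, so halving always tiles exactly.
    """
    h, w = image.shape
    # Start from the coarsest scale: a regular grid of max_size x max_size blocks.
    stack = [(y, x, max_size)
             for y in range(0, h, max_size)
             for x in range(0, w, max_size)]
    accepted = []
    while stack:
        y, x, s = stack.pop()
        block = image[y:y + s, x:x + s]
        # Keep a block if it is pure enough, or if it cannot be split further.
        if s <= min_size or block_purity(block, tol) >= purity_thresh:
            accepted.append((y, x, s))
        else:
            # Otherwise refine it into four half-size squares.
            half = s // 2
            stack.extend((y + dy, x + dx, half)
                         for dy in (0, half)
                         for dx in (0, half))
    return accepted
```

On an 8×8 image whose left half has intensity 0 and right half has intensity 1, this sketch rejects the single 8×8 block and accepts four 4×4 blocks, which together cover every pixel exactly once. Because every accepted region is an axis-aligned square, per-block features (for example, a mean over the block) can be computed with ordinary array slicing or pooling before the blocks are used as graph nodes or tokens.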