ETH | ANU | Sydney | Apr 29, 2026 | arXiv:2604.26555

FloatSOM: GPU-Accelerated, Distributed, Topology-Flexible Self-Organizing Maps

Tony Xu, Sarah Klamt, Katherine Turner, Anne Brustle, Felix Marsh-Wakefield, Givanna Putri

AI Summary

FloatSOM is a scalable SOM framework that supports multi-GPU execution, out-of-memory streaming, and flexible topologies beyond regular lattices. It targets a limitation of existing GPU-accelerated SOM implementations: as datasets grow, workloads no longer fit within device memory. The framework's topologies, combined with topology-aware hyperparameter fine-tuning, achieve lower quantization error than state-of-the-art SOM baselines across 14 datasets. At scale, FloatSOM trains a 1024-node SOM network on 1 billion samples with 50 features in 6.16 minutes using 8 GPUs across two nodes.

Key Contribution

FloatSOM trains a 1024-node SOM on 1 billion samples with 50 features in 6.16 minutes on 8 GPUs across two nodes, combining multi-GPU execution, out-of-memory streaming, and flexible topologies in a single framework.
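To put the headline figures in perspective, the short calculation below derives the implied throughput and raw dataset size from the numbers reported in the summary above. The 32-bit float storage is an assumption made for illustration; the source does not state the data type, and the variable names are hypothetical.

```python
# Back-of-the-envelope figures for the largest reported benchmark.
# Reported in the source: 1e9 samples, 50 features, 8 GPUs, 6.16 minutes.
# ASSUMPTION: values are stored as float32 (4 bytes); the source does not say.
n_samples = 1_000_000_000
n_features = 50
n_gpus = 8
minutes = 6.16
bytes_per_value = 4  # float32, assumed

seconds = minutes * 60
throughput = n_samples / seconds   # samples per second, all GPUs combined
per_gpu = throughput / n_gpus      # samples per second per GPU
dataset_gb = n_samples * n_features * bytes_per_value / 1e9  # decimal gigabytes

print(f"wall time:        {seconds:.1f} s")
print(f"throughput:       {throughput:,.0f} samples/s")
print(f"per-GPU rate:     {per_gpu:,.0f} samples/s")
print(f"raw dataset size: {dataset_gb:.0f} GB")
```

Under that assumption the raw data occupies about 200 GB and the run processes roughly 2.7 million samples per second overall, which illustrates why a workload of this size may not fit in device memory and has to be streamed.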

Abstract

GPU-accelerated Self-Organizing Map (SOM) implementations are among the most competitive options for large-scale SOM analysis, but growing dataset sizes increasingly challenge their practical use because workloads no longer fit cleanly within device-memory limits. We introduce FloatSOM, a SOM framework for scalable training and deployment that supports multi-GPU execution, out-of-memory disk-backed streaming, and novel topologies beyond regular lattices. We evaluate FloatSOM on 14 synthetic and real benchmark datasets together with controlled speed scaling benchmarks, and show that these improved topologies, combined with topology-aware hyperparameter fine-tuning, yield lower quantization error than current state-of-the-art SOM baselines. FloatSOM also sustains this performance at large scale with high-throughput distributed execution; in the largest benchmark, it trains a 1024-node SOM network on 1,000,000,000 samples with 50 features in 6.16 minutes on 8 GPUs across two separate high-performance-computing nodes.
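The abstract compares methods by quantization error and relies on streaming data that does not fit in memory. The sketch below is a minimal NumPy illustration of both ideas, not FloatSOM's implementation or API: it assumes the common definition of quantization error as the mean Euclidean distance between each sample and its best-matching unit (the source does not spell out its exact definition), and it accumulates that error chunk by chunk so the full dataset never needs to be resident at once. The function names `quantization_error` and `iter_chunks` are hypothetical.

```python
import numpy as np

def quantization_error(chunks, codebook):
    """Mean Euclidean distance from each sample to its best-matching unit.

    `chunks` is any iterable of (n_i, d) arrays; `codebook` is (n_nodes, d).
    Only running sums are kept, so memory use is bounded by the chunk size.
    """
    codebook = np.asarray(codebook, dtype=np.float64)
    cb_sq = (codebook ** 2).sum(axis=1)            # ||w||^2 for every node
    total, count = 0.0, 0
    for x in chunks:
        x = np.asarray(x, dtype=np.float64)
        # Squared distances via ||x||^2 - 2 x.w + ||w||^2
        d2 = (x ** 2).sum(axis=1, keepdims=True) - 2.0 * (x @ codebook.T) + cb_sq
        d2 = np.maximum(d2, 0.0)                   # guard against tiny negatives
        total += float(np.sqrt(d2.min(axis=1)).sum())
        count += x.shape[0]
    return total / count

def iter_chunks(data, chunk_size):
    """Yield consecutive row blocks of `data` (works on any sliceable array)."""
    for start in range(0, data.shape[0], chunk_size):
        yield data[start:start + chunk_size]

# Small synthetic example (random data, not from the paper).
rng = np.random.default_rng(0)
data = rng.normal(size=(10_000, 50))
codebook = rng.normal(size=(64, 50))

qe_streamed = quantization_error(iter_chunks(data, 1_000), codebook)
qe_full = quantization_error([data], codebook)
print(qe_streamed, qe_full)
```

Because only a running sum and a count are carried between chunks, the streamed result matches the single-pass result up to floating-point rounding. For disk-backed data, the same `iter_chunks` helper could be pointed at a `numpy.memmap` array, since it only relies on row slicing.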

Architecture Design (Transformers, SSMs, MoE)

Distributed Systems & Hardware

Training Efficiency & Optimization

Citation Metrics

Citations: 0

Influential citations: 0

References: 0

Year: 2026

Venue: N/A
