Search papers, labs, and topics across Lattice.
The paper introduces the Approximate Triangular Maximally Filtered Graph (a-TMFG) algorithm to address the scalability limitations of traditional TMFG construction, which requires storing a dense correlation matrix. A-TMFG leverages k-Nearest Neighbors Graphs (kNNG) for initial graph construction and employs a memory management strategy to estimate missing correlations on-the-fly. Experiments on datasets with millions of observations demonstrate the algorithm's robustness and its potential for constructing graphs in the absence of natural graph structures for supervised and unsupervised learning.
TMFGs can now scale to millions of data points thanks to a-TMFG, which approximates the correlation matrix on-the-fly using kNN graphs and clever memory management.
The traditional Triangular Maximally Filtered Graph (TMFG) construction requires pre-computation and storage of a dense correlation matrix; this limits its applicability to small and medium-sized datasets. Here we identify key memory and runtime complexity challenges when using TMFG at scale. We then present the Approximate Triangular Maximally Filtered Graph (a-TMFG) algorithm. This is a novel approach to scaling the construction of artificial graphs from data inspired by TMFG. The method employs k-Nearest Neighbors Graphs (kNNG) for initial construction, and implements a memory management strategy to search and estimate missing correlations on-the-fly. This provides representations to control combinatorial explosion. The algorithm is tested for robustness to the parameters and noise, and is evaluated on datasets with millions of observations. This new method provides a parsimonious way to construct graphs for use-cases where graphs are used as input to supervised and unsupervised learning but where no natural graph exists.