This paper tackles the problem of estimating the possibly different numbers of sender and receiver communities in multi-layer directed networks by proposing a goodness-of-fit test based on the multi-layer stochastic co-block model. The test statistic measures the deviation of the largest singular value of an aggregated normalized residual matrix from 2, and it behaves differently under correct model specification and under underfitting. The authors develop sequential and ratio-based testing procedures that consistently determine the true numbers of sender and receiver communities: the sequential procedure stops at the smallest pair of community numbers at which the test statistic drops below a threshold, and the ratio-based procedure detects sharp changes between consecutive test statistics.
Accurately estimate the number of sender and receiver communities in multi-layer directed networks, even when they differ, using a novel goodness-of-fit test.
Estimating the asymmetric numbers of communities in multi-layer directed networks is a challenging problem because of the multi-layer structure and the inherent directional asymmetry, which can lead to different numbers of sender and receiver communities. This work addresses this issue under the multi-layer stochastic co-block model, a model for multi-layer directed networks with distinct community structures on the sending and receiving sides, by proposing a novel goodness-of-fit test. The test statistic relies on the deviation of the largest singular value of an aggregated normalized residual matrix from the constant 2. The test statistic exhibits a sharp dichotomy: under the null hypothesis of correct model specification, its upper bound converges to zero with high probability; under underfitting, the test statistic itself diverges to infinity. With this property, we develop a sequential testing procedure that searches through candidate pairs of sender and receiver community numbers in lexicographic order. The process stops at the smallest such pair where the test statistic drops below a decaying threshold. For robustness, we also propose a ratio-based variant, which detects sharp changes in the sequence of test statistics by comparing consecutive candidates. Both methods are proven to consistently determine the true numbers of sender and receiver communities under the multi-layer stochastic co-block model.
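The sequential procedure described above can be illustrated with a minimal Python sketch. It is an assumption-laden illustration, not the authors' implementation: the function names `sequential_search` and `toy_stat`, the search cap `k_max`, and the fixed `threshold` are all hypothetical, and the real test statistic (built from the aggregated normalized residual matrix) is replaced by a toy stand-in that only mimics the dichotomy between underfitting and correct specification.

```python
from itertools import product
from typing import Callable, Optional, Tuple


def sequential_search(
    test_stat: Callable[[int, int], float],
    threshold: float,
    k_max: int,
) -> Optional[Tuple[int, int]]:
    """Scan candidate (sender, receiver) community numbers in lexicographic
    order and return the first pair whose statistic falls below `threshold`.

    `test_stat`, `threshold`, and `k_max` are hypothetical placeholders;
    the abstract does not specify how the threshold decays or how far the
    search extends.
    """
    # product(...) yields (1, 1), (1, 2), ..., (1, k_max), (2, 1), ...
    for k_s, k_r in product(range(1, k_max + 1), repeat=2):
        if test_stat(k_s, k_r) < threshold:
            return (k_s, k_r)
    return None  # no candidate pair passed the test


def toy_stat(k_s: int, k_r: int, true: Tuple[int, int] = (2, 3)) -> float:
    """Toy stand-in for the statistic: large whenever either community
    number is underfitted, small otherwise. Not the paper's statistic."""
    underfit = k_s < true[0] or k_r < true[1]
    return 10.0 if underfit else 0.01


estimate = sequential_search(toy_stat, threshold=0.1, k_max=5)
print(estimate)
```

In this toy setting the search skips every pair with fewer than 2 sender or 3 receiver communities and stops at the first pair that is not underfitted. In practice the callable would fit the model for each candidate pair and compute the singular-value-based statistic from the data.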