$1\times 10^{-5}$, and a gradient accumulation factor of 4 was used to achieve an effective batch size of 32 (a minimal sketch of this schedule appears after Table 1).

4 Experiments

We evaluate our sparse models against their dense counterparts and two training-free baselines: FastVGGT [33] and Block-Sparse VGGT [37]. Variants of these baselines built on VGGT/$\pi^3$ are referred to as FastVGGT-VGGT/$\pi^3$ and Block-Sparse VGGT/$\pi^3$, respectively. Unless stated otherwise, we use the following parameters. Our method employs a 4×4 compression window and selects the top-32 blocks for selective attention (see the sketch below). For the baselines, we adopt their default configurations: a 0.9 merge ratio for FastVGGT [33] and a 0.75 sparsity ratio for Block-Sparse VGGT/$\pi^3$ [37]. All inference times are benchmarked on a single H100 GPU.

4.1 Two-view Pose Estimation

Table 1: Pair-wise pose results on ScanNet-1500 [7, 29]. We report the Area Under the Curve (AUC) of the pose error at different thresholds. Best results per backbone are marked in bold.

Methods      AUC@5 ↑   AUC@10 ↑   AUC@20 ↑
VGGT [40]    37.45     59.24      75.69
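For reference, the training schedule above (learning rate $1\times 10^{-5}$, accumulation factor 4, effective batch size 32) implies a per-step micro-batch of 8, since 4 × 8 = 32. The following is a minimal PyTorch sketch of that schedule; the micro-batch size is inferred rather than stated, and the model, loss, and data here are stand-ins, not the paper's setup.

```python
import torch
import torch.nn as nn

# Stand-in model and synthetic data; only the accumulation logic mirrors the text.
model = nn.Linear(16, 1)
loss_fn = nn.MSELoss()
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-5)  # learning rate from the text

accum_steps = 4   # gradient accumulation factor (from the text)
micro_batch = 8   # assumed per-step batch: 4 x 8 = 32 effective

optimizer.zero_grad()
for step in range(100):
    x, y = torch.randn(micro_batch, 16), torch.randn(micro_batch, 1)
    loss = loss_fn(model(x), y) / accum_steps   # scale so gradients average over 32 samples
    loss.backward()                             # accumulate into .grad without updating
    if (step + 1) % accum_steps == 0:
        optimizer.step()                        # one optimizer update per 4 micro-batches
        optimizer.zero_grad()
```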
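The text fixes only two hyperparameters of the selective attention: a 4×4 compression window and top-32 block selection. The sketch below illustrates one plausible reading of that configuration in PyTorch: keys are mean-pooled over non-overlapping 4×4 windows, each query is scored against the pooled block summaries, and attention is then computed only over the tokens of its 32 highest-scoring blocks. The pooling operator, per-query selection, and dense gather are all assumptions for illustration, not the authors' implementation, which would rely on a fused block-sparse kernel for the reported speedups.

```python
import torch
import torch.nn.functional as F

def selective_block_attention(q, k, v, grid_hw, window=4, top_k=32):
    # q, k, v: (B, N, D) tokens laid out on an H x W grid; `window` and
    # `top_k` mirror the text (4x4 compression window, top-32 blocks).
    B, N, D = q.shape
    H, W = grid_hw
    bh, bw = H // window, W // window

    # Compress keys: mean-pool each non-overlapping window x window patch
    # into a single summary key, one per block.
    k_grid = k.transpose(1, 2).reshape(B, D, H, W)
    k_blk = F.avg_pool2d(k_grid, window).flatten(2).transpose(1, 2)   # (B, bh*bw, D)

    # Score every query against the block summaries; keep the top-k blocks.
    blk_scores = q @ k_blk.transpose(1, 2) / D ** 0.5                 # (B, N, bh*bw)
    top = blk_scores.topk(min(top_k, bh * bw), dim=-1).indices        # (B, N, K)

    # Expand each selected block index to the token indices inside it.
    dy, dx = torch.meshgrid(torch.arange(window, device=q.device),
                            torch.arange(window, device=q.device), indexing="ij")
    by, bx = top // bw, top % bw
    rows = by.unsqueeze(-1) * window + dy.flatten()                   # (B, N, K, w*w)
    cols = bx.unsqueeze(-1) * window + dx.flatten()
    idx = (rows * W + cols).flatten(2)                                # (B, N, K*w*w)

    # Gather the selected keys/values and attend densely over that subset.
    # (A real implementation would use a fused block-sparse kernel instead.)
    batch = torch.arange(B, device=q.device)[:, None, None]
    k_sel, v_sel = k[batch, idx], v[batch, idx]                       # (B, N, S, D)
    attn = torch.einsum("bnd,bnsd->bns", q, k_sel).div(D ** 0.5).softmax(-1)
    return torch.einsum("bns,bnsd->bnd", attn, v_sel)                 # (B, N, D)
```

For a 32×32 token grid this would be called as `selective_block_attention(q, k, v, grid_hw=(32, 32))`, so each query attends to 32 × 16 = 512 of the 1024 tokens.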