The paper introduces radVI, a novel variational inference algorithm that optimizes the radial profile of the surrogate distribution to better approximate high-dimensional target distributions. By optimizing directly over radial profiles, radVI addresses the limitations of standard VI methods that often fail to capture the correct radial behavior of the target, leading to improved coverage. The authors provide theoretical convergence guarantees based on Wasserstein space optimization and radial transport map regularity.
radVI is a cheap add-on to existing VI schemes, such as Gaussian (mean-field) VI and Laplace approximation, that optimizes the radial profile of the surrogate distribution to address the poor coverage Gaussian approximations can suffer from.
In variational inference (VI), the practitioner approximates a high-dimensional distribution $\pi$ with a simple surrogate, often a (product) Gaussian distribution. However, in many cases of practical interest, Gaussian distributions might not capture the correct radial profile of $\pi$, resulting in poor coverage. In this work, we approach the VI problem from the perspective of optimizing over these radial profiles. Our algorithm radVI is a cheap, effective add-on to many existing VI schemes, such as Gaussian (mean-field) VI and Laplace approximation. We provide theoretical convergence guarantees for our algorithm, owing to recent developments in optimization over the Wasserstein space (the space of probability distributions endowed with the Wasserstein distance) and new regularity properties of radial transport maps in the style of Caffarelli (2000).
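To make the radial-profile mismatch concrete, the following is a minimal illustrative sketch; it is not radVI and not code from the paper. It assumes a hypothetical heavy-tailed target, a multivariate Student-t with identity scale, and compares the spread of its radii $\|x\|$ with that of a standard Gaussian. The function names, the dimension `d`, the degrees of freedom `nu`, and the quantile levels are all choices made here for illustration. The spread is measured as a ratio of radial quantiles, which does not change when the Gaussian is rescaled, so no isotropic Gaussian can close the gap by adjusting its variance alone.

```python
import numpy as np


def gaussian_radii(n, d, rng):
    """Norms of n draws from N(0, I_d); these follow a chi distribution with d degrees of freedom."""
    return np.linalg.norm(rng.standard_normal((n, d)), axis=1)


def student_t_radii(n, d, nu, rng):
    """Norms of n draws from a d-dimensional Student-t (identity scale, nu degrees of freedom).

    Uses the standard scale-mixture construction x = z / sqrt(w),
    with z ~ N(0, I_d) and w ~ chi^2_nu / nu. This target is an
    assumption made for illustration only.
    """
    z = rng.standard_normal((n, d))
    w = rng.chisquare(nu, size=n) / nu
    return np.linalg.norm(z / np.sqrt(w)[:, None], axis=1)


def radial_spread(radii, lo=0.05, hi=0.95):
    """Ratio of the hi-quantile to the lo-quantile of the radii.

    The ratio is invariant to rescaling the distribution, so it
    isolates the shape of the radial profile from its overall scale.
    """
    return float(np.quantile(radii, hi) / np.quantile(radii, lo))


def compare_radial_profiles(n=20000, d=50, nu=4.0, seed=0):
    """Return (Gaussian spread, Student-t spread) for the same dimension d."""
    rng = np.random.default_rng(seed)
    return (
        radial_spread(gaussian_radii(n, d, rng)),
        radial_spread(student_t_radii(n, d, nu, rng)),
    )


if __name__ == "__main__":
    gauss_spread, target_spread = compare_radial_profiles()
    print(f"Gaussian radial spread (q95/q05):  {gauss_spread:.2f}")
    print(f"Student-t radial spread (q95/q05): {target_spread:.2f}")
```

In high dimension the Gaussian's radii concentrate in a thin shell, so its spread ratio stays close to 1, while the heavy-tailed target's ratio is several times larger. A surrogate whose radial profile is fixed to the Gaussian one therefore cannot cover such a target, whatever its scale.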