This paper introduces MDA, a mixture-density representation for depth estimation that targets flying points, the spurious 3D points predicted in the empty space between foreground and background surfaces. Instead of a single depth value, each pixel predicts multiple depth hypotheses with associated probabilities, so an ambiguous boundary pixel can keep both surfaces as candidates. Across different backbones, MDA substantially improves boundary reconstruction and largely removes flying-point artifacts, even under severe input blur, while adding negligible runtime overhead. The same framework extends to transparent objects, where it predicts multiple depth layers, and to sky regions, where a dedicated component separates the unbounded sky from finite-depth regions.
Letting a depth model predict multiple hypotheses for ambiguous boundary pixels largely removes flying points.
Despite advances in depth estimation, flying points remain a persistent failure mode: near object boundaries, depth estimators often predict spurious 3D points in the empty space between foreground and background surfaces. We trace this artifact to a standard modeling choice: assigning each pixel a single depth hypothesis. At boundaries, a pixel can straddle a foreground and a background surface, so its true depth is ambiguous between the two. A model that predicts a single depth cannot keep both possibilities, so training instead pulls the prediction toward an intermediate depth that lies on neither surface. We address this with MDA, a mixture-density representation that lets the model predict multiple depth hypotheses and their associated probabilities for each pixel. Near boundaries, different hypotheses can align with different surfaces, and the decoded depth is selected from one of these hypotheses rather than placed in the empty space between them. Across different backbones, MDA substantially improves boundary reconstruction and largely removes flying-point artifacts even under severe input blur, while adding negligible runtime overhead. The same mixture-density framework naturally extends to transparent objects, where it predicts multiple depth layers at transparent pixels, and to sky regions, where a dedicated component separates the unbounded sky from finite-depth regions, producing flying-point-free skylines. Project Page: https://biansy000.github.io/mda-site/.
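The contrast between averaging and selecting can be illustrated with a toy boundary pixel. The sketch below is not the authors' implementation: the function names, the two-hypothesis setup, and the rule of picking the most probable hypothesis are all assumptions made for illustration, since the abstract says only that the decoded depth is selected from one of the hypotheses.

```python
# Illustrative sketch only: names and the argmax selection rule are assumptions,
# not taken from the MDA paper.

def decode_mean(depths, probs):
    """Probability-weighted average of the hypotheses.

    Mimics a single-hypothesis model pulled toward an intermediate depth.
    """
    return sum(d * p for d, p in zip(depths, probs))


def decode_mode(depths, probs):
    """Return the depth of the most probable hypothesis (assumed selection rule)."""
    best = max(range(len(probs)), key=lambda k: probs[k])
    return depths[best]


# Hypothetical boundary pixel: foreground surface at 1.0 m, background at 5.0 m.
depths = [1.0, 5.0]
probs = [0.6, 0.4]

mean_depth = decode_mean(depths, probs)  # falls between the two surfaces
mode_depth = decode_mode(depths, probs)  # falls on the foreground surface

print(f"averaged depth: {mean_depth:.2f} m")
print(f"selected depth: {mode_depth:.2f} m")
```

In this toy case the averaged depth lies in the empty space between the two surfaces, which is where a flying point would appear, while the selected depth stays on one of the surfaces.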