This paper introduces DynaTok, a point-based framework for 4D reconstruction from incomplete, unordered point cloud sequences. It targets the limitations of existing methods, which either rely on images or, among point-based approaches, focus on single objects, assume relatively complete inputs, or require explicit correspondences. DynaTok encodes frames into compact latent tokens, aggregates partial observations over time with a Transformer-based spatiotemporal encoder, and decouples geometry from motion through residual tokens. A flow-matching decoder then reconstructs the complete sequence from those tokens. Experiments on object- and scene-level benchmarks show improved reconstruction quality and temporal coherence.
DynaTok reconstructs complete, temporally consistent 4D point cloud sequences from partial observations, improving reconstruction quality and temporal coherence on object- and scene-level benchmarks without relying on images or explicit correspondences.
We address 4D reconstruction from partial point cloud sequences, where depth-sensor observations are incomplete, unordered, and lack explicit temporal correspondences. This geometry-only setting is challenging due to missing observations and ambiguous dynamics. While recent progress has largely relied on image-based methods, existing point-based approaches typically focus on single objects, assume relatively complete inputs, or require explicit correspondences. To address these limitations, we propose DynaTok, a point-based framework for correspondence-free 4D reconstruction from partial point cloud sequences without images. DynaTok encodes frames into compact latent tokens, aggregates incomplete observations over time with a Transformer-based spatiotemporal encoder, and decouples geometry and motion through residual tokens in a unified model. A flow-matching decoder then reconstructs complete, temporally consistent 4D point-cloud sequences conditioned on the latent tokens. Experiments on object- and scene-level benchmarks demonstrate improved reconstruction quality and temporal coherence from partial point cloud observations. Project page: https://wrchen530.github.io/dynatok/.
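To make the flow-matching decoding step more concrete, the following is a minimal NumPy sketch of generic flow matching with a straight-line probability path. It is not DynaTok's implementation: the abstract does not specify the path, the network, or the sampler, so the function names (`flow_matching_pair`, `euler_sample`, `oracle_velocity`), the linear path, the Euler integrator, and the point count are all assumptions made for illustration. In a real model the velocity field would be a learned network conditioned on the latent tokens; here a closed-form oracle stands in for it, with the target point cloud passed as the conditioning signal.

```python
import numpy as np


def flow_matching_pair(x0, x1, t):
    """Point on the straight path from noise x0 to data x1 at time t, plus its velocity."""
    x_t = (1.0 - t) * x0 + t * x1   # linear interpolation between noise and data
    v_target = x1 - x0              # constant velocity along that path (regression target)
    return x_t, v_target


def euler_sample(velocity_fn, x0, cond, num_steps=10):
    """Integrate dx/dt = velocity_fn(x, t, cond) from t=0 to t=1 with forward Euler."""
    x = x0.copy()
    dt = 1.0 / num_steps
    for i in range(num_steps):
        t = i * dt
        x = x + dt * velocity_fn(x, t, cond)
    return x


def oracle_velocity(x, t, cond):
    """Toy stand-in for a trained network: here cond is the target cloud itself."""
    return (cond - x) / (1.0 - t)


rng = np.random.default_rng(0)
target = rng.standard_normal((128, 3))   # hypothetical complete frame, 128 points
noise = rng.standard_normal((128, 3))    # initial sample at t=0
recon = euler_sample(oracle_velocity, noise, cond=target)
print(np.abs(recon - target).max())      # small; the oracle field carries noise onto the target
```

During training, a network would be regressed onto `v_target` at randomly drawn times `t`; at inference, `euler_sample` would be called with that network and the latent tokens in place of the oracle and the target cloud.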