Ani3DHuman is a framework that combines kinematics-based animation with video diffusion priors to generate photorealistic 3D human animations. It disentangles rigid and non-rigid motion using a layered representation, guiding a video diffusion model with coarse kinematic renderings to restore non-rigid details. To handle out-of-distribution initial renderings, a self-guided stochastic sampling method is proposed that balances photorealistic quality with identity fidelity during diffusion.
Achieve photorealistic 3D human animation by guiding video diffusion with kinematic renderings and a novel self-guided stochastic sampling method that overcomes out-of-distribution challenges.
Current 3D human animation methods struggle to achieve photorealism: kinematics-based approaches lack non-rigid dynamics (e.g., clothing motion), while methods that leverage video diffusion priors can synthesize non-rigid motion but suffer from quality artifacts and identity loss. To overcome these limitations, we present Ani3DHuman, a framework that marries kinematics-based animation with video diffusion priors. We first introduce a layered motion representation that disentangles rigid motion from residual non-rigid motion. Rigid motion is generated by a kinematic method, which produces a coarse rendering to guide the video diffusion model in generating video sequences that restore the residual non-rigid motion. However, this restoration task, based on diffusion sampling, is highly challenging: the initial renderings are out-of-distribution, causing standard deterministic ODE samplers to fail. We therefore propose a novel self-guided stochastic sampling method, which addresses the out-of-distribution problem by combining stochastic sampling (for photorealistic quality) with self-guidance (for identity fidelity). The restored videos provide high-quality supervision, enabling optimization of the residual non-rigid motion field. Extensive experiments demonstrate that Ani3DHuman generates photorealistic 3D human animation, outperforming existing methods. Code is available at https://github.com/qiisun/ani3dhuman.
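The abstract contrasts deterministic ODE sampling, which fails on out-of-distribution coarse renderings, with stochastic sampling plus self-guidance. A minimal sketch of that idea is below; the actual Ani3DHuman update rule is not given in the abstract, so the `denoise` stand-in, the noise schedule, and the guidance form are all illustrative assumptions: injected noise re-randomizes the trajectory (restoring realism), while the denoised estimate is blended toward the initial rendering to retain identity.

```python
import numpy as np

def denoise(x, t):
    # Placeholder for the video diffusion model's clean-frame (x0) prediction.
    # A toy stand-in, not a real network.
    return x * (1.0 - t)

def self_guided_stochastic_sample(x_init, n_steps=10, eta=0.5, w=0.3, seed=0):
    """Sketch of a stochastic sampler with self-guidance.

    x_init : coarse (out-of-distribution) kinematic rendering
    eta    : stochasticity weight (eta=0 reduces to a deterministic step)
    w      : self-guidance weight pulling the estimate toward x_init
    """
    rng = np.random.default_rng(seed)
    x = x_init + rng.standard_normal(x_init.shape)  # noise the coarse rendering
    ts = np.linspace(1.0, 0.0, n_steps + 1)
    for t, t_next in zip(ts[:-1], ts[1:]):
        x0_hat = denoise(x, t)
        # Self-guidance: blend the denoised estimate toward the reference
        # rendering to preserve identity.
        x0_hat = (1 - w) * x0_hat + w * x_init
        # Stochastic (ancestral-style) step back toward noise level t_next;
        # fresh noise keeps the trajectory on-distribution.
        noise = rng.standard_normal(x.shape)
        x = x0_hat + t_next * ((1 - eta) * (x - x0_hat) / max(t, 1e-8)
                               + eta * noise)
    return x
```

With `eta=0` and `w=0` this collapses to a plain deterministic sampler, which is exactly the regime the abstract reports as failing on out-of-distribution inputs.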