Tsinghua AI · Central South University · University of Science and Technology · Apr 13, 2026 · arXiv:2604.11156

rPPG-VQA: A Video Quality Assessment Framework for Unsupervised rPPG Training

Tianyang Dai, Ming Chang, Yan Chen, Yang Hu

AI Summary

This paper introduces rPPG-VQA, a framework that assesses whether a video is suitable for unsupervised remote photoplethysmography (rPPG) training. It combines signal-level SNR estimation with scene-level interference detection by a multimodal large language model in a dual-branch architecture, and adds a two-stage adaptive sampling (TAS) strategy to curate training datasets from "in-the-wild" videos. The authors report that rPPG models trained on data filtered by rPPG-VQA achieve substantially better accuracy on standard benchmarks.

Key Contribution

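Pre-screening "in-the-wild" videos with a dual-branch quality assessment (SNR-based estimation of physiological signal quality plus multimodal-LLM detection of scene interference such as motion and unstable lighting) and training unsupervised rPPG models only on the filtered data yields a substantial accuracy improvement on standard benchmarks.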

Abstract

Unsupervised remote photoplethysmography (rPPG) promises to leverage unlabeled video data, but its potential is hindered by a critical challenge: training on low-quality "in-the-wild" videos severely degrades model performance. An essential step missing here is to assess the suitability of the videos for rPPG model learning before using them for the task. Existing video quality assessment (VQA) methods are mainly designed for human perception and not directly applicable to the above purpose. In this work, we propose rPPG-VQA, a novel framework for assessing video suitability for rPPG. We integrate signal-level and scene-level analyses and design a dual-branch assessment architecture. The signal-level branch evaluates the physiological signal quality of the videos via robust signal-to-noise ratio (SNR) estimation with a multi-method consensus mechanism, and the scene-level branch uses a multimodal large language model (MLLM) to identify interferences like motion and unstable lighting. Furthermore, we propose a two-stage adaptive sampling (TAS) strategy that utilizes the quality score to curate optimal training datasets. Experiments show that by training on large-scale, "in-the-wild" videos filtered by our framework, we can develop unsupervised rPPG models that achieve a substantial improvement in accuracy on standard benchmarks. Our code is available at https://github.com/Tianyang-Dai/rPPG-VQA.
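To make the signal-level idea concrete, the following is a minimal sketch of a spectral SNR score for a candidate pulse signal, combined across several extraction methods. It is not the authors' implementation: the function names, the 0.7 to 3.0 Hz heart-rate band, the 0.1 Hz window around the peak and its first harmonic, and the use of a median as the consensus rule are all assumptions made for illustration. The abstract does not specify how the paper's SNR estimator or consensus mechanism is defined.

```python
import numpy as np


def spectral_snr_db(signal, fs, band=(0.7, 3.0), half_width=0.1):
    """Spectral SNR in dB for a candidate pulse signal (illustrative only).

    Signal power: in-band spectral power within `half_width` Hz of the
    dominant in-band peak and of its first harmonic.
    Noise power: all remaining in-band power.
    The band and window width are assumed values, not taken from the paper.
    """
    x = np.asarray(signal, dtype=float)
    x = x - x.mean()  # remove the DC component
    freqs = np.fft.rfftfreq(x.size, d=1.0 / fs)
    power = np.abs(np.fft.rfft(x)) ** 2

    in_band = (freqs >= band[0]) & (freqs <= band[1])
    peak = freqs[in_band][np.argmax(power[in_band])]
    near_peak = (np.abs(freqs - peak) <= half_width) | (
        np.abs(freqs - 2.0 * peak) <= half_width
    )

    signal_power = power[in_band & near_peak].sum()
    noise_power = power[in_band & ~near_peak].sum()
    return 10.0 * np.log10(signal_power / (noise_power + 1e-12))


def consensus_snr_db(candidates, fs):
    """Median SNR over pulse estimates from different extraction methods.

    The median is a hypothetical stand-in for the paper's multi-method
    consensus mechanism.
    """
    return float(np.median([spectral_snr_db(c, fs) for c in candidates]))


# Synthetic usage example: a clean 1.2 Hz pulse (72 bpm) versus pure noise.
rng = np.random.default_rng(0)
fs = 30.0  # assumed video frame rate in Hz
t = np.arange(300) / fs  # 10 seconds of samples
clean = np.sin(2 * np.pi * 1.2 * t) + 0.1 * rng.standard_normal(t.size)
noisy = rng.standard_normal(t.size)

print("clean SNR (dB):", spectral_snr_db(clean, fs))
print("noisy SNR (dB):", spectral_snr_db(noisy, fs))
print("consensus (dB):", consensus_snr_db([clean, clean, noisy], fs))
```

In this sketch a video whose pulse estimates yield a low consensus score would be a candidate for exclusion from training. A median is used here only because it tolerates a single failing extraction method; other aggregation rules are equally plausible.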

Computer Vision · Data Curation & Synthetic Data

Citation Metrics

Citations: 0

Influential citations: 0

References: 0

Year: 2026

Venue: N/A
