Mar 17, 2026

arXiv:2603.17061

Collecting Prosody in the Wild: A Content-Controlled, Privacy-First Smartphone Protocol and Empirical Evaluation

Timo K. Koch, Florian Bemmann, Ramona Schoedel, Markus Buehner, Clemens Stachl

AI Summary

This paper introduces a smartphone-based protocol for collecting prosodic speech data in natural settings while addressing semantic confounding and privacy concerns. The protocol uses scripted read-aloud sentences to control lexical content and performs on-device prosodic feature extraction before deleting raw audio. A large-scale deployment (N=560) demonstrates the protocol's feasibility and data quality, validated through speaker sex and affective state prediction tasks.

Key Contribution

A new smartphone protocol enables large-scale, privacy-preserving collection of prosodic speech data in the wild: scripted read-aloud sentences hold lexical content constant, and prosodic features are extracted on-device before the raw audio is deleted.

Abstract

Collecting everyday speech data for prosodic analysis is challenging due to the confounding of prosody and semantics, privacy constraints, and participant compliance. We introduce and empirically evaluate a content-controlled, privacy-first smartphone protocol that uses scripted read-aloud sentences to standardize lexical content (including prompt valence) while capturing natural variation in prosodic delivery. The protocol performs on-device prosodic feature extraction, deletes raw audio immediately, and transmits only derived features for analysis. We deployed the protocol in a large study (N = 560; 9,877 recordings), evaluated compliance and data quality, and conducted diagnostic prediction tasks on the extracted features, predicting speaker sex and concurrently reported momentary affective states (valence, arousal). We discuss implications and directions for advancing and deploying the protocol.
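The extract-then-delete flow described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the abstract does not say which features or toolkit the protocol uses, so the three features here (duration, RMS energy, and an autocorrelation-based F0 estimate), the mono 16-bit WAV input, and every function name are assumptions chosen to keep the example self-contained.

```python
import math
import os
import struct
import wave


def read_mono_pcm16(path):
    """Read a mono 16-bit PCM WAV file into floats in [-1, 1)."""
    with wave.open(path, "rb") as wf:
        if wf.getnchannels() != 1 or wf.getsampwidth() != 2:
            raise ValueError("this sketch assumes mono 16-bit PCM")
        rate = wf.getframerate()
        raw = wf.readframes(wf.getnframes())
    count = len(raw) // 2
    samples = [s / 32768.0 for s in struct.unpack("<%dh" % count, raw)]
    return rate, samples


def estimate_f0(samples, rate, fmin=75.0, fmax=400.0, window=2048):
    """Crude F0 estimate: the lag with the highest autocorrelation.

    Only the first `window` samples are used; real prosody toolkits
    track pitch frame by frame and handle unvoiced segments.
    """
    frame = samples[:window]
    min_lag = max(1, int(rate / fmax))
    max_lag = min(int(rate / fmin), len(frame) - 1)
    best_lag, best_score = None, 0.0
    for lag in range(min_lag, max_lag + 1):
        score = sum(frame[i] * frame[i + lag] for i in range(len(frame) - lag))
        if score > best_score:
            best_lag, best_score = lag, score
    return rate / best_lag if best_lag else None


def extract_and_discard(path):
    """Derive features from a recording, then delete the raw audio file.

    Only the returned dict would leave the device. Note that os.remove
    unlinks the file; it is not a secure erase.
    """
    try:
        rate, samples = read_mono_pcm16(path)
        energy = sum(s * s for s in samples)
        return {
            "duration_s": len(samples) / rate,
            "rms": math.sqrt(energy / len(samples)) if samples else 0.0,
            "f0_hz": estimate_f0(samples, rate),
        }
    finally:
        os.remove(path)  # raw audio is removed even if extraction fails


def synth_tone_wav(path, freq=200.0, rate=8000, seconds=0.5):
    """Demo helper (hypothetical): write a sine tone standing in for speech."""
    n = int(rate * seconds)
    values = (
        int(0.5 * 32767 * math.sin(2 * math.pi * freq * i / rate))
        for i in range(n)
    )
    with wave.open(path, "wb") as wf:
        wf.setnchannels(1)
        wf.setsampwidth(2)
        wf.setframerate(rate)
        wf.writeframes(struct.pack("<%dh" % n, *values))
```

Calling `extract_and_discard` on a recording returns a small dictionary of derived values and leaves no audio file behind, so the transmitted payload carries no reconstructable speech. Placing the deletion in a `finally` clause is a deliberate choice in this sketch: the raw file is removed whether or not feature extraction succeeds.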

Data Curation & Synthetic Data, Natural Language Processing, Speech & Audio

Citation Metrics

Citations: 0

Influential citations: 0

References: 24

Year: 2026

Venue: N/A
