This paper introduces a hybrid architecture, Neck-Learn, for detecting vocal hyperfunction from week-long neck-surface accelerometer recordings. Neck-Learn combines gradient-boosted trees on day-level distributional features with a CNN-based multiple instance learning (MIL) framework that preserves and learns from the temporal dynamics within each day. On a held-out test set, the model exceeds the challenge baselines, reaching AUCs of 0.879 for phonotraumatic vocal hyperfunction (PVH) and 0.848 for non-phonotraumatic vocal hyperfunction (NPVH), while also providing clinically relevant insights.
Neck-Learn's hybrid architecture, which combines gradient-boosted trees with CNN-based multiple instance learning, improves ambulatory detection of vocal hyperfunction by preserving the within-day temporal dynamics of voice data.
Vocal hyperfunction (VH) is a prevalent voice disorder whose ambulatory detection remains challenging despite extensive daily voice data. Prior approaches capture week-long neck-surface accelerometer recordings but collapse them into fixed-length subject-level feature vectors, discarding the within-day temporal dynamics that encode nuanced interactions among voicing features. We introduce a novel hybrid architecture combining gradient-boosted trees on day-level distributional features with a CNN-based multiple instance learning (MIL) framework that preserves and learns from temporal dynamics throughout each day. On the held-out test set, our model exceeds the challenge baselines (AUC: 0.82 PVH, 0.77 NPVH), achieving AUCs of 0.879 for PVH (Rank 5) and 0.848 for NPVH (Rank 3), while also providing clinically relevant insights into both pathologies.
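The contrast between the two branches can be illustrated with a small, standard-library-only sketch. It is not the authors' implementation: every function name below is hypothetical, `trend_scorer` is a toy stand-in for the CNN, the tree-branch probability is a placeholder number, and the mean pooling and weighted-average fusion are assumptions, since the abstract does not specify either. The sketch shows only that order-free day-level summaries cannot distinguish two days that differ in their temporal pattern, whereas a bag of time-ordered windows can.

```python
import math
import statistics
from typing import Callable, List, Sequence


def day_distribution_features(values: Sequence[float]) -> List[float]:
    """Order-free summary of one day of a voicing feature (tree-branch input)."""
    # 19 cut points at 5%, 10%, ..., 95% of the day's distribution.
    cuts = statistics.quantiles(values, n=20, method="inclusive")
    return [
        statistics.fmean(values),
        statistics.pstdev(values),
        cuts[0],    # 5th percentile
        cuts[9],    # 50th percentile (median)
        cuts[18],   # 95th percentile
    ]


def split_into_instances(values: Sequence[float], window: int) -> List[List[float]]:
    """Cut one day into consecutive, non-overlapping windows (the MIL instances)."""
    return [
        list(values[i:i + window])
        for i in range(0, len(values) - window + 1, window)
    ]


def trend_scorer(instance: Sequence[float]) -> float:
    """Toy stand-in for a CNN: a sigmoid of the within-window trend (assumption)."""
    return 1.0 / (1.0 + math.exp(-(instance[-1] - instance[0])))


def mil_day_score(
    instances: Sequence[Sequence[float]],
    instance_scorer: Callable[[Sequence[float]], float],
) -> float:
    """Pool instance scores into one day-level score (mean pooling, an assumption)."""
    return statistics.fmean(instance_scorer(inst) for inst in instances)


def fuse(p_tree: float, p_mil: float, weight: float = 0.5) -> float:
    """Late fusion by weighted average of the two branch outputs (assumption)."""
    return weight * p_tree + (1.0 - weight) * p_mil


# Two synthetic "days" with identical value distributions but opposite time order.
rising_day = [float(i) for i in range(40)]
falling_day = rising_day[::-1]

features_rising = day_distribution_features(rising_day)
features_falling = day_distribution_features(falling_day)

score_rising = mil_day_score(split_into_instances(rising_day, 10), trend_scorer)
score_falling = mil_day_score(split_into_instances(falling_day, 10), trend_scorer)

# Placeholder tree-branch probability; a real system would obtain it from a
# gradient-boosted model trained on the distributional features.
placeholder_tree_probability = 0.5
fused_rising = fuse(placeholder_tree_probability, score_rising)
fused_falling = fuse(placeholder_tree_probability, score_falling)
```

In this toy setup the two days yield the same distributional features, so any model that sees only those features must score them identically, while the windowed branch separates them. That is the motivation the abstract gives for adding the MIL branch rather than relying on collapsed feature vectors alone.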