NasoVoce, a nose-mounted interface integrating a microphone and vibration sensor, captures both acoustic and vibration signals from the nasal bridge to enable always-available speech interaction. By fusing these complementary inputs, the system enhances speech recognition accuracy and robustness against environmental noise compared to using either sensor alone. Evaluations using Whisper Large-v2, PESQ, STOI, and MUSHRA demonstrate improved recognition and speech quality, indicating the potential for practical, discreet AI voice conversations.
A microphone and vibration sensor mounted together on the nose bridge enable robust, low-audibility speech interfaces for always-on AI interaction, even in noisy environments.
Silent and whispered speech offer promise for always-available voice interaction with AI, yet existing methods struggle to balance vocabulary size, wearability, silence, and noise robustness. We present NasoVoce, a nose-bridge-mounted interface that integrates a microphone and a vibration sensor. Positioned at the nasal pads of smart glasses, it unobtrusively captures both acoustic and vibration signals. The nasal bridge, close to the mouth, allows access to bone- and skin-conducted speech and enables reliable capture of low-volume utterances such as whispered speech. While the microphone captures high-quality audio, it is highly sensitive to environmental noise. Conversely, the vibration sensor is robust to noise but yields lower signal quality. By fusing these complementary inputs, NasoVoce generates high-quality speech robust against interference. Evaluation with Whisper Large-v2, PESQ, STOI, and MUSHRA ratings confirms improved recognition and quality. NasoVoce demonstrates the feasibility of a practical interface for always-available, continuous, and discreet AI voice conversations.
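The abstract does not say how the two signals are fused, so the following is only a minimal sketch of one classic way to combine complementary sensors, not the authors' method. It assumes a first-order complementary filter: the low band comes from the noise-robust vibration sensor and the high band from the higher-fidelity microphone. The function names, the 1 kHz crossover, and the 16 kHz sample rate are all hypothetical choices made for illustration.

```python
import math


def one_pole_lowpass(signal, cutoff_hz, sample_rate_hz):
    """First-order IIR low-pass filter (exponential smoothing)."""
    # Smoothing coefficient derived from the RC time constant of the cutoff.
    dt = 1.0 / sample_rate_hz
    rc = 1.0 / (2.0 * math.pi * cutoff_hz)
    alpha = dt / (rc + dt)

    out = []
    prev = 0.0
    for x in signal:
        prev = prev + alpha * (x - prev)
        out.append(prev)
    return out


def fuse_complementary(mic, vib, cutoff_hz=1000.0, sample_rate_hz=16000.0):
    """Illustrative fusion (an assumption, not NasoVoce's algorithm).

    Low band:  low-passed vibration signal (robust to ambient noise).
    High band: microphone minus its own low-passed version.
    """
    if len(mic) != len(vib):
        raise ValueError("mic and vib must have the same number of samples")

    vib_low = one_pole_lowpass(vib, cutoff_hz, sample_rate_hz)
    mic_low = one_pole_lowpass(mic, cutoff_hz, sample_rate_hz)

    # The two bands are complementary: if mic == vib, the output equals the input.
    return [vl + (m - ml) for vl, m, ml in zip(vib_low, mic, mic_low)]
```

Because the high-pass branch is defined as the signal minus its low-passed copy, the two bands sum back to the original whenever both sensors agree. This makes the sketch easy to sanity-check, but a practical system would more likely use a learned enhancement model.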