KTH | Mar 30, 2026 | arXiv:2603.28338

Users and Wizards in Conversations: How WoZ Interface Choices Define Human-Robot Interactions

Ekaterina Torubarova, Jura Miniota, André Pereira

AI Summary

This paper examines how the choice of Wizard-of-Oz (WoZ) interface affects human-robot interaction during conversations, comparing a restricted GUI, an unrestricted GUI, and a VR telepresence interface. Users preferred the VR-mediated interaction in terms of robot features and perceived social presence. Wizards found the VR interface the most demanding but reported a stronger social connection with users, and VR produced the most connected conversations as measured by inter-speaker gaps and overlaps.

Key Contribution

VR telepresence in Wizard-of-Oz studies does more than feel immersive: it changes the interaction dynamics, fostering a stronger social connection and a more connected conversational flow than GUI-based interfaces.

Abstract

In this paper, we investigated how the choice of a Wizard-of-Oz (WoZ) interface affects communication with a robot from both the user's and the wizard's perspective. In a conversational setting, we used three WoZ interfaces with varying levels of dialogue input and output restrictions: a) a restricted perception GUI that showed fixed-view video and ASR transcripts and let the wizard trigger pre-scripted utterances and gestures; b) an unrestricted perception GUI that added real-time audio from the participant and the robot; and c) a VR telepresence interface that streamed immersive stereo video and audio to the wizard and forwarded the wizard's spontaneous speech, gaze, and facial expressions to the robot. We found that the interaction mediated by the VR interface was preferred by users in terms of robot features and perceived social presence. For the wizards, the VR condition turned out to be the most demanding but elicited a stronger social connection with the users. The VR interface also induced the most connected interaction in terms of inter-speaker gaps and overlaps, while the Restricted GUI induced the least connected flow and the longest silences. Given these results, we argue for more WoZ studies using telepresence interfaces. These studies better reflect the robots of tomorrow and offer a promising path to automation based on naturalistic, contextualized verbal and non-verbal behavioral data.

Natural Language Processing | Robotics & Embodied AI | Speech & Audio

Citation Metrics

Citations: 0

Influential citations: 0

References: 0

Year: 2025

Venue: Robotics
