Speech rate perception and interlocutor identification in human-directed vs. device-directed speech
DOI:
https://doi.org/10.36505/TheLinguisticProceedings/2025/16/01/001/000661
Keywords:
human-directed speech, speech perception, device-directed speech, speech rate, interlocutor identification
Abstract
This study investigates how listeners perceive differences between human-directed and device-directed speech, focusing on speech rate and interlocutor identification. Seventy-eight native Arabic speakers (aged 19–22; M = 20.46, SD = 1.11) participated in two tasks: rating the speed of 30 short recordings and determining whether each sample was directed towards a person or a device. The results showed that device-directed speech was consistently perceived as faster, while human-directed speech enabled more accurate interlocutor identification. Statistical analyses confirmed that these differences were significant, with moderate effect sizes. The findings suggest that devices produce speech efficiently but lack the natural variability that characterises human communication. Incorporating more dynamic and expressive features into voice systems could improve user engagement. Future research should consider cultural differences and emotional tone in shaping speech perception.
License
Articles are published under the Creative Commons Attribution 4.0 International License (CC BY 4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited.