Speech rate perception and interlocutor identification in human-directed vs. device-directed speech

Authors

  • Yahya Aldholmi, King Saud University, Saudi Arabia
  • May Al-Sager, King Saud University, Saudi Arabia
  • Arwa Alsahafi, King Saud University, Saudi Arabia
  • Reema Alshiddi, King Saud University, Saudi Arabia

DOI:

https://doi.org/10.36505/TheLinguisticProceedings/2025/16/01/001/000661

Keywords:

human-directed speech, speech perception, device-directed speech, speech rate, interlocutor identification

Abstract

This study investigates how listeners perceive differences between human-directed and device-directed speech, focusing on speech rate and interlocutor identification. Seventy-eight native Arabic speakers (aged 19–22; M = 20.46, SD = 1.11) participated in two tasks: rating the speed of 30 short recordings and determining whether each sample was directed towards a person or a device. The results showed that device-directed speech was consistently perceived as faster, while human-directed speech enabled more accurate interlocutor identification. Statistical analyses confirmed that these differences were significant, with moderate effect sizes. The findings suggest that devices produce speech efficiently but lack the natural variability that characterises human communication. Incorporating more dynamic and expressive features into voice systems could improve user engagement. Future research should consider cultural differences and emotional tone in shaping speech perception.

Published

01-09-2025

How to Cite

Aldholmi, Y., Al-Sager, M., Alsahafi, A., & Alshiddi, R. (2025). Speech rate perception and interlocutor identification in human-directed vs. device-directed speech. Linguistic Proceedings Series, 16(1), 1–4. https://doi.org/10.36505/TheLinguisticProceedings/2025/16/01/001/000661
