AI vs. human (automatic) speech recognition: silence-replacement paradigm as a diagnostic

Authors

  • Yahya Aldholmi, King Saud University, Saudi Arabia

DOI:

https://doi.org/10.36505/TheLinguisticProceedings/2025/17/02/002/000688

Keywords:

vowel importance, consonant importance, ASR, English, silence-replacement paradigm

Abstract

This study tests how vowels and consonants contribute to sentence-level word recognition in automatic speech recognition (ASR), using a silence-replacement paradigm modeled on classic human-perception research. I recorded 48 English sentences divided into two sets: 24 with a symmetrical ratio and 24 with an asymmetrical ratio. For each sentence I created two processed versions: CO (consonant-only; vowels replaced by silence) and VO (vowel-only; consonants replaced by silence). I then submitted all stimuli to two state-of-the-art ASR systems, TurboScribe and Whisper, and quantified word recognition as the percentage of original words correctly transcribed. With symmetrical material, VO speech outperformed CO speech, mirroring human patterns. With asymmetrical material, however, this advantage reversed dramatically, so that CO speech outperformed VO speech, showing a strong interaction between segment type and stimulus structure.
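
As an illustration only, the two operations named in the abstract (replacing one segment class with silence, and scoring a transcript as the percentage of original words recovered) can be sketched in Python. This is not the study's actual pipeline. The function names, the sample-index segment format with "V"/"C" labels, and the order-insensitive, case-insensitive word matching are all assumptions made for this sketch.

```python
import re
from collections import Counter


def silence_replace(samples, segments, keep):
    """Return a copy of `samples` with every segment NOT labeled `keep` zeroed.

    samples  : list of float amplitudes (hypothetical mono waveform)
    segments : list of (start, end, label) with sample indices, end exclusive,
               and label "V" (vowel) or "C" (consonant); assumed format
    keep     : "V" for a vowel-only (VO) stimulus, "C" for consonant-only (CO)
    """
    out = list(samples)
    for start, end, label in segments:
        # Clamp to the waveform so slice assignment cannot change its length.
        start, end = max(0, start), min(end, len(out))
        if label != keep and end > start:
            out[start:end] = [0.0] * (end - start)
    return out


def _words(text):
    # Lowercase and keep alphabetic tokens (with apostrophes); an assumption.
    return re.findall(r"[a-z']+", text.lower())


def percent_words_correct(reference, transcript):
    """Percentage of reference words that also occur in the transcript.

    Matching ignores word order and case, and each transcript word can
    match at most one reference word (assumed scoring rule).
    """
    ref = _words(reference)
    if not ref:
        raise ValueError("reference sentence has no words")
    available = Counter(_words(transcript))
    hits = 0
    for word in ref:
        if available[word] > 0:
            available[word] -= 1
            hits += 1
    return 100.0 * hits / len(ref)


if __name__ == "__main__":
    wave = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6]
    segs = [(0, 2, "C"), (2, 4, "V"), (4, 6, "C")]
    print(silence_replace(wave, segs, keep="V"))  # VO version
    print(silence_replace(wave, segs, keep="C"))  # CO version
    print(percent_words_correct("The cat sat on the mat", "the cat sat"))
```

In practice the segment boundaries would come from a phonetic alignment of each recording, and a stricter scoring rule (for example, one sensitive to word order) would give different percentages.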

Published

01-12-2025

Section

Proceedings Papers

How to Cite

Aldholmi, Y. (2025). AI vs. human (automatic) speech recognition: silence-replacement paradigm as a diagnostic. Linguistic Proceedings Series, 5-8. https://doi.org/10.36505/TheLinguisticProceedings/2025/17/02/002/000688
