Improving intelligibility of time-scale compressed speech for visually impaired and sighted listeners

Authors

  • Panagiotis Pantalos, University of Crete, Greece
  • George P. Katzentzis
  • Anna Sfakianaki, University of Ioannina, Greece
  • Yiannis Stylianou

DOI:

https://doi.org/10.36505/TheLinguisticProceedings/2025/17/02/016/000702

Keywords:

speech, transformations, intelligibility, visual-impairment, perception

Abstract

Time-scale compression enables faster speech playback but often reduces intelligibility, especially at high compression rates, where non-stationary speech components are distorted. This work investigates whether protecting non-stationary regions during time compression improves intelligibility for visually impaired and sighted listeners. Building on Waveform Similarity Overlap-Add (WSOLA), we propose a protection method that adapts the scale factors according to three non-stationarity criteria derived from Root-Mean-Square (RMS) energy and Line Spectrum Frequencies. Experiments with visually impaired and control participants evaluate intelligibility and listener preference across uniform and protected WSOLA variants. Results show that RMS-based protected WSOLA improves intelligibility, while comparisons at equal words per minute reveal smaller perceptual differences. The findings highlight the importance of preserving transient information for accessible high-speed speech.
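
The idea of adapting scale factors to protect non-stationary regions can be illustrated with a minimal sketch. The abstract does not specify the three criteria, their thresholds, or how the remaining compression is redistributed, so everything below is an assumption made for illustration: the normalized RMS-change score, the `threshold` and `floor` values, and the function names `frame_rms`, `rms_nonstationarity` and `protected_scale_factors` are hypothetical and are not the authors' method. Scale factors are taken here to be output-to-input duration ratios, so values below 1 mean compression.

```python
import math


def frame_rms(signal, frame_len, hop):
    """RMS energy of each full frame of `signal` (a sequence of floats)."""
    values = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = signal[start:start + frame_len]
        values.append(math.sqrt(sum(x * x for x in frame) / frame_len))
    return values


def rms_nonstationarity(rms, eps=1e-12):
    """Normalized RMS change between adjacent frames, each in [0, 1].

    Assumed criterion for illustration only; the first frame scores 0.
    """
    if not rms:
        return []
    scores = [0.0]
    for prev, cur in zip(rms, rms[1:]):
        scores.append(abs(cur - prev) / (cur + prev + eps))
    return scores


def protected_scale_factors(scores, overall, threshold=0.3, floor=0.2):
    """Per-frame duration ratios (output / input).

    Frames scoring above `threshold` are protected (ratio 1.0, left
    uncompressed). The other frames share one smaller ratio chosen so the
    mean ratio equals `overall`, unless that ratio would fall below
    `floor`, in which case it is clamped and the target is not met.
    """
    n = len(scores)
    protected = [s > threshold for s in scores]
    n_protected = sum(protected)
    if n_protected == n:
        return [1.0] * n
    rest = (overall * n - n_protected) / (n - n_protected)
    rest = max(rest, floor)
    return [1.0 if p else rest for p in protected]


# Toy signal: a quiet segment followed by an abrupt, louder segment.
signal = [0.1] * 16 + [1.0] * 16
rms = frame_rms(signal, frame_len=4, hop=4)
scores = rms_nonstationarity(rms)
factors = protected_scale_factors(scores, overall=0.5)
```

In this toy case only the frame at the abrupt energy jump is protected; the other frames are compressed slightly more than the uniform ratio of 0.5 so that the total duration still matches the target. A time-scale modification routine such as WSOLA would then consume these per-frame ratios in place of a single uniform factor.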

Published

01-12-2025

Section

Proceedings Papers

How to Cite

Improving intelligibility of time-scale compressed speech for visually impaired and sighted listeners. (2025). Linguistic Proceedings Series, 61-64. https://doi.org/10.36505/TheLinguisticProceedings/2025/17/02/016/000702
