Analysis of prosodic correlates of emotional speech data
DOI: https://doi.org/10.36505/ExLing-2018/09/0004/000337
Keywords: expressive speech, emotions, prosodic groups, prosodic correlate
Abstract
The study of expressive speech styles remains an important topic for parameter detection and prediction in speech processing. In this paper, we analyze prosodic correlates for six emotional styles (anger, disgust, joy, fear, surprise, and sadness) using data uttered by two speakers. The analysis examines how pronunciation and prosodic parameters are modified in emotional speech relative to a neutral style. It covers pronunciation modifications, the presence of pauses in sentences, and local prosodic behavior, with an emphasis on prosody over prosodic groups and breathing groups.
License
Articles are published under the Creative Commons Attribution 4.0 International License (CC BY 4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited.