Analysis of prosodic correlates of emotional speech data

Katarina Bartkova; Denis Jouvet

doi:10.36505/ExLing-2018/09/0004/000337

Authors

Katarina Bartkova, Université de Lorraine, CNRS, ATILF, F-54000 Nancy, France
Denis Jouvet, Université de Lorraine, CNRS, Inria, LORIA, F-54000 Nancy, France

Keywords:

expressive speech, emotions, prosodic groups, prosodic correlate

Abstract

Expressive speech styles remain an important research topic in speech processing, for both parameter detection and parameter prediction. In this paper, we analyze the prosodic correlates of six emotional styles (anger, disgust, joy, fear, surprise, and sadness) in data uttered by two speakers. The analysis focuses on how pronunciation and prosodic parameters change in emotional speech relative to a neutral style. It covers pronunciation modifications, the presence of pauses within sentences, and local prosodic behavior, with particular emphasis on prosody over prosodic groups and breathing groups.
