Phonetic words duration simulation using Deep Neural Networks
DOI:
https://doi.org/10.36505/ExLing-2016/07/0037/000296
Keywords:
deep NN, duration modeling, phonetic words
Abstract
Deep neural networks (DNNs) are widely used in speech prediction and speech modeling. This paper describes the implementation of a DNN for predicting the duration of speech units (allophones and syllables, which form the structure of the phonetic word and the intonation phrase). It is well known that numerous factors influence segment duration. However, the level of confidence of these characteristics differs significantly. It was found that a deep neural network that predicts allophone duration gives better results than a network that predicts syllable duration.
License
Articles are published under the Creative Commons Attribution 4.0 International License (CC BY 4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited.