Modelling prosodic structure using Artificial Neural Networks
DOI:
https://doi.org/10.36505/ExLing-2017/08/0005/000307
Keywords:
Cypriot Greek, statements, questions, convolutional networks, LSTMs
Abstract
The ability to accurately perceive whether a speaker is asking a question or making a statement is crucial for any successful interaction. However, learning and classifying tonal patterns has been a challenging task for automatic speech recognition and for models of tonal representation, because tonal contours vary considerably. This paper provides a classification model of Cypriot Greek questions and statements. We evaluate two state-of-the-art network architectures: a Long Short-Term Memory (LSTM) network and a convolutional network (ConvNet). The ConvNet outperforms the LSTM in the classification task, reaching 95% classification accuracy.
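As a rough illustration of how a convolutional classifier can score a tonal contour, the sketch below runs one 1D filter over a pitch (F0) track, applies a ReLU and global max pooling, and passes the result through a sigmoid. It is not the model evaluated in the paper: the kernel, weight, bias, and both toy contours are hypothetical values chosen by hand, and treating a local pitch rise as the cue for a question is an assumption made only for the example, not a claim about Cypriot Greek intonation.

```python
import math


def conv1d(signal, kernel):
    """Valid 1D cross-correlation of a contour with a kernel."""
    k = len(kernel)
    return [
        sum(signal[i + j] * kernel[j] for j in range(k))
        for i in range(len(signal) - k + 1)
    ]


def relu(values):
    """Zero out negative filter responses."""
    return [max(0.0, v) for v in values]


def classify(contour, kernel, weight, bias):
    """Score one contour: conv -> ReLU -> global max pool -> sigmoid."""
    features = relu(conv1d(contour, kernel))
    pooled = max(features)  # strongest filter response anywhere in the contour
    return 1.0 / (1.0 + math.exp(-(weight * pooled + bias)))


# Hypothetical hand-set parameters (a trained ConvNet would learn many filters).
RISE_KERNEL = [-1.0, 0.0, 1.0]  # responds to a local rise in pitch
WEIGHT, BIAS = 0.2, -2.0

# Toy F0 contours in Hz, invented for illustration (not data from the paper).
falling = [220.0, 215.0, 210.0, 200.0, 190.0, 180.0]
final_rise = [200.0, 198.0, 195.0, 205.0, 225.0, 250.0]

p_falling = classify(falling, RISE_KERNEL, WEIGHT, BIAS)
p_rise = classify(final_rise, RISE_KERNEL, WEIGHT, BIAS)
print(f"falling contour score: {p_falling:.3f}")
print(f"rising contour score:  {p_rise:.3f}")
```

Under these assumed parameters the contour containing a rise receives a score above 0.5 and the steadily falling one a score below 0.5. An LSTM would instead read the contour one frame at a time and carry a hidden state forward, which this sketch does not attempt to show.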
License
Articles are published under the Creative Commons Attribution 4.0 International License (CC BY 4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited.