Kairi Tamuri will defend her doctoral thesis, titled “Basic emotions in read Estonian speech: acoustic analysis and modelling”, on Friday, 6 October 2017 at 14:15.
Professor Karl Pajusalu
Candidate of Philology Hille Pajupuu (Institute of the Estonian Language)
Professor Jean Léo Leonard (PhD, Université Paris – Sorbonne)
Professor Jaan Ross (Estonian Academy of Music and Theatre)
Description of the problem
Because synthetic speech has many applications in different fields, such as human-machine interaction, multimedia, and aids for the disabled, it is vital that it sound natural, that is, as human-like as possible. One route to naturalness is to add emotions to synthetic speech by means of models that feed the synthesiser with the combinations of acoustic parameter values necessary for emotional expression. To create such models of emotional speech, detailed knowledge of the vocal expression of emotions in human speech is needed first. For that purpose I had to investigate to what extent, if any, and in what direction emotions influence the values of acoustic speech parameters (e.g., fundamental frequency, intensity and speech rate), and which parameters make it possible to discriminate emotions from each other and from neutral speech.
The present doctoral dissertation had two major purposes: (a) to identify and describe the acoustic expression of three basic emotions – joy, sadness and anger – in read Estonian speech, and (b) to create, based on the resulting description, acoustic models of emotional speech designed to help parametric synthesis of Estonian speech express these emotions recognisably.
Result and benefit
The results provided material for creating acoustic models of emotions. Speech synthesised with the test models was presented to evaluators, who were asked to decide which of the models helped to produce synthetic speech with recognisable emotions (recorded examples can be accessed at https://www.eki.ee/heli/index.php?option=com_content&view=article&id=7&Itemid=494). The experiment showed that, with models based on the acoustic results, an Estonian speech synthesiser can satisfactorily express sadness and anger, while joy was recognised less well by listeners. This doctoral dissertation describes one possible way of vocally expressing joy, sadness and anger in Estonian speech and presents some models that enable emotions to be added to Estonian synthetic speech. The study serves as a starting point for the future development of acoustic models for Estonian emotional synthetic speech.