Advances in both speech/emotion recognition and emotional speech synthesis depend largely on the availability of annotated emotional speech corpora. Although corpora are commonly purpose-built for specific applications or research purposes, it would be desirable to re-use existing ones. However, there is a lack of widely accepted standards in areas such as audio quality and metadata annotation for querying, as well as of mutually agreed definitions, as in ‘what is emotion?’. The work described here is a developing process of emotional asset acquisition, annotation and on-line publishing for emotional rating by end users. It attempts to address some of the above issues while remaining flexible on practical matters such as re-usability, standardisation and access. The paper is divided into three parts: (1) a method for obtaining “genuine” emotional speech recordings, namely Mood Induction Procedures (MIP 4), while recording in a controlled environment; (2) the analysis and annotation of the recorded assets via a purpose-built audio analysis tool; and (3) an implementation of the IMDI corpus annotation schema.
|Publication status||Published - 2008|
|Event||The 6th International Conference on Language Resources and Evaluation (LREC 2008) - Palais des Congrès Mansour Eddahbi Marrakech, Marrakech, Morocco|
|Duration||26 May 2008 → 1 Jun 2008|
|Abbreviated title||LREC 2008|