Emotional speech corpus construction, annotation and distribution

Charlie Cullen, Brian Vaughan

Research output: Contribution to conference (Paper)

Abstract

Advances in both speech/emotion recognition and emotional speech synthesis largely depend on the availability of annotated, emotional speech corpora. Although corpora are commonly purpose-built for specific applications or research purposes, it would be desirable to re-use existing corpora. However, there is a lack of widely accepted standards in such areas as audio quality and metadata annotation to support queries, as well as a lack of mutually agreed definitions, as in ‘what is emotion?’. The work described here is a developing process of emotional asset acquisition, annotation and on-line publishing for emotional rating by end users, which attempts to address some of the above issues while remaining flexible on practical matters such as re-usability, standardisation and access. The paper is divided into three parts: (1) a method for obtaining “genuine” emotional speech recordings, namely Mood Induction Procedures (MIP 4), while recording in a controlled environment; (2) the analysis and annotation of the recorded assets via a purpose-built audio analysis tool; and (3) an implementation of the IMDI corpus annotation schema.
Original language: English
Publication status: Published - 2008
Event: The 6th International Conference on Language Resources and Evaluation (LREC 2008) - Palais des Congrès Mansour Eddahbi Marrakech, Marrakech, Morocco
Duration: 26 May 2008 to 1 Jun 2008
http://www.lrec-conf.org/lrec2008/

Conference

Conference: The 6th International Conference on Language Resources and Evaluation (LREC 2008)
Abbreviated title: LREC 2008
Country/Territory: Morocco
City: Marrakech
Period: 26/05/08 to 1/06/08