This item is available under a Creative Commons License for non-commercial use only
The success of supervised learning approaches to classifying emotion in speech depends heavily on the quality of the training data. Manual annotation of emotional speech assets is the primary way of gathering training data for emotional speech recognition. This position paper proposes the use of crowdsourcing for rating emotional speech assets. Recent developments in learning from crowdsourced data offer opportunities to determine accurate ratings for assets that have been annotated by large numbers of non-expert individuals. The challenges involved include identifying good annotators, determining consensus ratings, and learning the bias of annotators.
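As a rough illustration of two of the challenges named above (determining consensus ratings and identifying good annotators), the following minimal sketch uses plain majority voting as a baseline. It is not the method proposed in the paper. The function names, the tuple layout `(asset_id, annotator_id, label)`, and the example ratings are all hypothetical and chosen only for illustration.

```python
from collections import Counter, defaultdict


def consensus_labels(ratings):
    """Majority-vote label per asset (baseline assumption, not the paper's method).

    ratings: iterable of (asset_id, annotator_id, label) tuples.
    Ties are broken deterministically by taking the alphabetically first label.
    """
    by_asset = defaultdict(list)
    for asset, _annotator, label in ratings:
        by_asset[asset].append(label)

    consensus = {}
    for asset, labels in by_asset.items():
        counts = Counter(labels)
        top = max(counts.values())
        consensus[asset] = min(lab for lab, c in counts.items() if c == top)
    return consensus


def annotator_agreement(ratings, consensus):
    """Fraction of each annotator's ratings that match the consensus label."""
    hits = defaultdict(int)
    totals = defaultdict(int)
    for asset, annotator, label in ratings:
        totals[annotator] += 1
        hits[annotator] += int(label == consensus[asset])
    return {a: hits[a] / totals[a] for a in totals}


# Hypothetical ratings: three annotators, two speech clips.
ratings = [
    ("clip1", "ann_a", "angry"),
    ("clip1", "ann_b", "angry"),
    ("clip1", "ann_c", "neutral"),
    ("clip2", "ann_a", "happy"),
    ("clip2", "ann_b", "happy"),
    ("clip2", "ann_c", "sad"),
]

consensus = consensus_labels(ratings)
agreement = annotator_agreement(ratings, consensus)
print(consensus)
print(agreement)
```

A low agreement score flags an annotator who may be unreliable, but it can equally reflect a consistent personal bias, which is why modelling annotator bias is listed as a separate challenge.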
Tarasov, A., Cullen, C. & Delany, S. (2010) Using Crowdsourcing for labeling emotional speech assets. W3C workshop on Emotion ML, Paris, France, 5-6 October. doi:10.21427/D7RS4G