Training speech emotion classifier without categorical annotations

Speaker: Meysam Shamsi

Date and place: Nov 3, 2022, at 10:30

Abstract: The emotion recognition task can be treated either as classification using categorical labels or as regression using a dimensional description in a continuous space. An investigation of the relation between these two representations will be presented, and then a classification pipeline that uses only dimensional annotations will be proposed. This approach relies on a regressor model trained to predict a vector of continuous values in the dimensional representation for a given speech audio input. The output of this model can then be interpreted as an emotional category using a mapping algorithm. We investigated the performance of combinations of three feature extractors, three neural network architectures, and three mapping algorithms on two different corpora. Our study shows the advantages and limitations of the classification-via-regression approach.
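
To illustrate the mapping step described in the abstract, here is a minimal sketch of one possible mapping algorithm: assigning a predicted dimensional vector to the category with the nearest anchor point. The abstract does not name the three mapping algorithms that were studied, so this nearest-centroid rule, the two-dimensional (valence, arousal) space, the anchor coordinates, and the names CENTROIDS and map_to_category are all assumptions made for illustration only.

```python
import math

# Hypothetical category anchors in (valence, arousal) space, each in [-1, 1].
# These coordinates are illustrative assumptions, not values from the talk.
CENTROIDS = {
    "happy": (0.8, 0.5),
    "angry": (-0.6, 0.8),
    "sad": (-0.7, -0.5),
    "neutral": (0.0, 0.0),
}


def map_to_category(prediction, centroids=CENTROIDS):
    """Return the label whose anchor is closest (Euclidean) to the prediction."""
    return min(
        centroids,
        key=lambda label: math.dist(prediction, centroids[label]),
    )


# Stand-in for the regressor output on one utterance (made-up numbers).
predicted = (0.6, 0.4)
label = map_to_category(predicted)
print(label)
```

In a real pipeline the anchors would more plausibly be estimated from data, for example as the mean dimensional annotation of each category in a corpus that carries both kinds of labels, rather than fixed by hand as above.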