Speech Emotion Recognition using Unsupervised Feature Selection Algorithms
Combining different speech features is a common way to improve the accuracy of Speech Emotion Recognition (SER). However, this can sharply increase processing time, and some of the features contribute little to emotion recognition, often causing incorrect predictions that substantially reduce the accuracy of the SER system. Hence, there is a need to select a feature set that contributes significantly to emotion recognition. This paper presents the use of the Feature Selection with Adaptive Structure Learning (FSASL) and Unsupervised Feature Selection with Ordinal Locality (UFSOL) algorithms for feature dimension reduction. A novel Subset Feature Selection (SuFS) algorithm is proposed to further reduce the feature dimension and achieve comparable or better accuracy when used along with the FSASL and UFSOL algorithms. The work considers 1582 INTERSPEECH 2010 Paralinguistic features, 20 Gammatone Cepstral Coefficients (GTCC), and a Support Vector Machine classifier with 10-fold cross-validation and hold-out validation. The EMO-DB and IEMOCAP databases are used to evaluate the performance of the proposed SER system in terms of classification accuracy and computational time. The result analysis indicates that the proposed SER system outperforms the existing ones.
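The evaluation protocol named in the abstract (unsupervised feature selection, then a Support Vector Machine scored with 10-fold cross-validation and hold-out validation) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the paper's implementation: FSASL, UFSOL and SuFS are not available in scikit-learn, so a simple variance ranking (the hypothetical helper `top_variance_features`) stands in for the unsupervised selector. The data are synthetic placeholders rather than EMO-DB or IEMOCAP features, and the feature count, kernel, and 80/20 hold-out split are arbitrary choices.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import StratifiedKFold, cross_val_score, train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC


def top_variance_features(X, k):
    """Stand-in unsupervised selector (NOT FSASL/UFSOL/SuFS):
    return the sorted column indices of the k highest-variance features."""
    return np.sort(np.argsort(X.var(axis=0))[-k:])


def evaluate(X, y, k_features, seed=0):
    """Return (mean 10-fold CV accuracy, hold-out accuracy) of an SVM
    trained on the selected feature subset."""
    # Selection uses only X, never the labels, so it is unsupervised.
    idx = top_variance_features(X, k_features)
    X_sel = X[:, idx]

    # Standardise features, then classify with an RBF-kernel SVM (assumed kernel).
    model = make_pipeline(StandardScaler(), SVC(kernel="rbf"))

    # 10-fold cross-validation.
    cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=seed)
    cv_acc = cross_val_score(model, X_sel, y, cv=cv).mean()

    # Hold-out validation with an assumed 80/20 split.
    X_tr, X_te, y_tr, y_te = train_test_split(
        X_sel, y, test_size=0.2, stratify=y, random_state=seed
    )
    holdout_acc = model.fit(X_tr, y_tr).score(X_te, y_te)
    return cv_acc, holdout_acc


# Synthetic placeholder data: 400 samples, 200 features, 4 classes.
X, y = make_classification(
    n_samples=400, n_features=200, n_informative=20, n_classes=4, random_state=0
)
cv_acc, holdout_acc = evaluate(X, y, k_features=50)
print(f"10-fold CV accuracy: {cv_acc:.3f}, hold-out accuracy: {holdout_acc:.3f}")
```

With real data, `X` would hold the extracted feature vectors (for example the 1582 INTERSPEECH 2010 Paralinguistic features plus 20 GTCCs per utterance) and `y` the emotion labels. Timing each call to `evaluate` for different values of `k_features` would give a rough view of the accuracy versus computational time trade-off that the abstract describes.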
Keywords: Speech Emotion Recognition, INTERSPEECH Paralinguistic Feature Set, GTCC, feature selection, feature optimization, FSASL, UFSOL, SuFS
Document type: Peer reviewed
Document version: Final PDF
Source: Radioengineering. 2020, vol. 29, no. 2, pp. 353-364. ISSN 1210-2512