Speech Emotion Recognition Using Deep Learning

  • Unique Paper ID: 160727
  • Volume: 10
  • Issue: 1
  • PageNo: 1155-1161
  • Abstract: Speech emotion recognition (SER) has been a fast-growing research domain in recent years. Unlike humans, machines lack the ability to understand and express emotions; however, automatic emotion recognition can improve human-machine interaction and reduce the need for human intervention. An SER system is a set of techniques for processing and classifying speech signals in order to detect the emotions embedded in them. In this work, the RAVDESS database for speech emotion recognition is obtained from Kaggle. Mel-frequency cepstral coefficient (MFCC) features are extracted from the speech signals, and a deep learning algorithm, a convolutional neural network (CNN), classifies these features to recognize the emotion. Such a system eases the identification of the speaker’s emotion and mental state. The CNN model implemented in this work recognizes the emotional state of the speaker with a training accuracy of 96% and a testing accuracy of 85%.

Copyright & License

Copyright © 2025. The authors retain the copyright of this article. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

BibTeX

@article{160727,
        author = {Jennifer C Saldanha and Rohan Pinto},
        title = {Speech Emotion Recognition Using Deep Learning},
        journal = {International Journal of Innovative Research in Technology},
        year = {},
        volume = {10},
        number = {1},
        pages = {1155-1161},
        issn = {2349-6002},
        url = {https://ijirt.org/article?manuscript=160727},
        abstract = {Speech emotion recognition (SER) has been a fast-growing research domain in recent years. Unlike humans, machines lack the ability to understand and express emotions; however, automatic emotion recognition can improve human-machine interaction and reduce the need for human intervention. An SER system is a set of techniques for processing and classifying speech signals in order to detect the emotions embedded in them. In this work, the RAVDESS database for speech emotion recognition is obtained from Kaggle. Mel-frequency cepstral coefficient (MFCC) features are extracted from the speech signals, and a deep learning algorithm, a convolutional neural network (CNN), classifies these features to recognize the emotion. Such a system eases the identification of the speaker’s emotion and mental state. The CNN model implemented in this work recognizes the emotional state of the speaker with a training accuracy of 96% and a testing accuracy of 85%.},
        keywords = {Speech Emotion Recognition, Mel Frequency Cepstral Coefficients, Convolutional Neural Network, Long Short-Term Memory, Deep Belief Network, Recurrent Neural Network},
        month = {},
        }
