DL-Simplified
Emotion Recognition from Audio using Deep Learning
Deep Learning Simplified Repository (Proposing a new issue)
:red_circle: Project Title : Emotion Recognition from Audio using Deep Learning
:red_circle: Aim : To build a deep learning model that can analyze audio recordings and classify the emotions expressed. This can have applications in areas such as customer service, mental health monitoring, and entertainment.
:red_circle: Dataset : Various publicly available datasets for emotion recognition in audio, such as RAVDESS, TESS, CREMA-D, etc.
:red_circle: Approach : Try to use 3-4 algorithms to implement the models and compare them to find the best-fitting algorithm by checking the accuracy scores. Also, do not forget to do an exploratory data analysis before creating any model.
📍 Follow the Guidelines to Contribute in the Project :
- You need to create a separate folder named after the Project Title.
- Inside that folder, there will be four main components.
  - `Images` - To store the required images.
  - `Dataset` - To store the dataset or information/source about the dataset.
  - `Model` - To store the machine learning model you've created using the dataset.
  - `requirements.txt` - This file will contain the required packages/libraries needed to run the project on other machines.
- Inside the `Model` folder, the `README.md` file must be filled up properly, with proper visualizations and conclusions.
:red_circle::yellow_circle: Points to Note :
- The issues will be assigned on a first come, first served basis, 1 Issue == 1 PR.
- "Issue Title" and "PR Title" should be the same. Include the issue number along with it.
- Follow the Contributing Guidelines & Code of Conduct before you start contributing.
:white_check_mark: To be Mentioned while taking the issue :
- Full name : Chethana Potukanam
- GitHub Profile Link : https://github.com/ChethanaPotukanam
- Email ID : [email protected]
- Participant ID (if applicable):
- Approach for this Project :
  1. Load the dataset.
  2. Exploratory Data Analysis (EDA): visualize common patterns and features in audio signals.
  3. Feature Extraction: extract features such as MFCC, Chroma, Mel Spectrogram, etc. (see the feature-extraction sketch after this list).
  4. Model Implementation: Convolutional Neural Network (CNN), Recurrent Neural Network (RNN), Long Short-Term Memory (LSTM), Bidirectional LSTM (BiLSTM).
  5. Train and evaluate each model.
  6. Compare performance using accuracy and loss metrics.
- What is your participant role? (Mention the Open Source program) GSSoC24
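A minimal sketch of the feature-extraction step above, assuming `librosa` is installed; `audio_path` is a placeholder, and the parameter values (sampling rate, number of MFCCs) are illustrative rather than prescribed by the issue:

```python
import numpy as np
import librosa

def extract_features(audio_path, sr=22050, n_mfcc=40):
    """Return one fixed-length vector of MFCC, chroma, and mel features per clip."""
    signal, sr = librosa.load(audio_path, sr=sr)
    mfcc = np.mean(librosa.feature.mfcc(y=signal, sr=sr, n_mfcc=n_mfcc).T, axis=0)
    chroma = np.mean(librosa.feature.chroma_stft(y=signal, sr=sr).T, axis=0)
    mel = np.mean(librosa.feature.melspectrogram(y=signal, sr=sr).T, axis=0)
    # Averaging each feature over time gives a fixed-length vector per clip,
    # a common baseline representation before moving to sequence models.
    return np.concatenate([mfcc, chroma, mel])
```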
Happy Contributing 🚀
All the best. Enjoy your open source journey ahead. 😎
Thank you for creating this issue! We'll look into it as soon as possible. Your contributions are highly appreciated! 😊
Assigned @ChethanaPotukanam
Can I work on this?
Please share your approach.
@abhisheks008 could you please assign me this issue?
My approach is as follows:
- Using the RAVDESS dataset for emotional speech audio.
- Feature Engineering: convert the audio into Mel spectrogram format.
- Using a CNN to classify the audio according to emotions (using the VGG-16 and ResNet50 architectures).
- Using a CNN to extract features from the spectrogram and then applying an LSTM / Bi-LSTM on the encodings (a sketch of this CNN + BiLSTM idea appears below).
- Using HuggingFace speech2text and then the spaCy universal sentence encoder to convert the resulting text into encoding vectors, which can be classified with an ANN.

This will be followed by evaluating the models using metrics and visualizing heatmaps of the confusion matrices to analyse the error distribution.
Name : Moksh Patel
GitHub profile link : https://github.com/T3CH-Pyth0n
Event : Kharagpur Winter of Code (KWoC)
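A minimal Keras sketch of the CNN + BiLSTM idea from the approach above, assuming mel spectrograms padded or cropped to a fixed 128×128 shape and the 8 emotion classes of RAVDESS; the layer sizes and input shape are illustrative assumptions, not part of the original approach:

```python
from tensorflow.keras import layers, models

# Assumed input shape (time steps x mel bins); RAVDESS labels 8 emotions.
N_MELS, TIME_STEPS, N_EMOTIONS = 128, 128, 8

def build_cnn_bilstm(n_classes=N_EMOTIONS):
    """Small CNN front-end over the spectrogram, BiLSTM over the encodings."""
    inputs = layers.Input(shape=(TIME_STEPS, N_MELS, 1))
    x = layers.Conv2D(32, 3, activation="relu", padding="same")(inputs)
    x = layers.MaxPooling2D(2)(x)
    x = layers.Conv2D(64, 3, activation="relu", padding="same")(x)
    x = layers.MaxPooling2D(2)(x)
    # Fold the frequency axis into channels so each time step is one feature vector.
    x = layers.Reshape((TIME_STEPS // 4, (N_MELS // 4) * 64))(x)
    x = layers.Bidirectional(layers.LSTM(64))(x)
    outputs = layers.Dense(n_classes, activation="softmax")(x)
    model = models.Model(inputs, outputs)
    model.compile(optimizer="adam",
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    return model
```

Swapping the BiLSTM layer for a plain `layers.LSTM(64)` gives the unidirectional variant, so both models in the approach can be compared from the same front-end.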
Hi @T3CH-Pyth0n, sorry for replying late. Assigning this issue to you.
@abhisheks008 I'll be altering the approach a bit, but I'll still implement 3-4 models. Does that work?
Yes that'll work.
@abhisheks008 Can I work on this?