
Emotion Recognition from Audio using Deep Learning

Open ChethanaPotukanam opened this issue 1 year ago • 9 comments

Deep Learning Simplified Repository (Proposing new issue)

:red_circle: Project Title : Emotion Recognition from Audio using Deep Learning

:red_circle: Aim : To build a deep learning model that can analyze audio recordings and classify the emotions expressed. This can have applications in areas such as customer service, mental health monitoring, and entertainment.

:red_circle: Dataset : Various publicly available datasets for emotion recognition in audio, such as RAVDESS, TESS, CREMA-D, etc.

:red_circle: Approach : Try to use 3-4 algorithms to implement the models, and compare them by accuracy score to find the best-fitting algorithm. Also, do not forget to do an exploratory data analysis before creating any model.


📍 Follow the Guidelines to Contribute in the Project :

  • You need to create a separate folder named as the Project Title.
  • Inside that folder, there will be four main components.
    • Images - To store the required images.
    • Dataset - To store the dataset or, information/source about the dataset.
    • Model - To store the machine learning model you've created using the dataset.
    • requirements.txt - This file will contain the packages/libraries required to run the project on other machines.
  • Inside the Model folder, the README.md file must be filled up properly, with proper visualizations and conclusions.

:red_circle::yellow_circle: Points to Note :

  • The issues will be assigned on a first come first serve basis, 1 Issue == 1 PR.
  • "Issue Title" and "PR Title" should be the same. Include the issue number along with it.
  • Follow the Contributing Guidelines & Code of Conduct before you start contributing.

:white_check_mark: To be Mentioned while taking the issue :

  • Full name : Chethana Potukanam
  • GitHub Profile Link : https://github.com/ChethanaPotukanam
  • Email ID : [email protected]
  • Participant ID (if applicable):
  • Approach for this Project :
    • Load the dataset.
    • Exploratory Data Analysis (EDA): Visualise common patterns and features in audio signals.
    • Feature Extraction: Extract features such as MFCC, Chroma, Mel Spectrogram, etc.
    • Model Implementation: Convolutional Neural Network (CNN), Recurrent Neural Network (RNN), Long Short-Term Memory (LSTM), Bidirectional LSTM (BiLSTM).
    • Train and evaluate each model.
    • Compare performance using accuracy and loss metrics.
  • What is your participant role? (Mention the Open Source program) GSSoC24

Happy Contributing 🚀

All the best. Enjoy your open source journey ahead. 😎

ChethanaPotukanam avatar Jul 06 '24 17:07 ChethanaPotukanam

Thank you for creating this issue! We'll look into it as soon as possible. Your contributions are highly appreciated! 😊

github-actions[bot] avatar Jul 06 '24 17:07 github-actions[bot]

Assigned @ChethanaPotukanam

abhisheks008 avatar Jul 07 '24 08:07 abhisheks008

can i work on this ?

Uknowme-h avatar Oct 02 '24 11:10 Uknowme-h

can i work on this ?

Please share your approach.

abhisheks008 avatar Oct 05 '24 04:10 abhisheks008

@abhisheks008 could you please assign me this issue?

my approach is as follows:-

using the RAVDESS dataset for emotional speech audio

Feature Engineering : convert audio into Mel spectrogram format
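A minimal sketch of the Mel spectrogram step, for illustration only. It uses plain NumPy so the mechanics are visible; in practice a library such as librosa would likely be used instead. The function names and the parameter defaults (`n_fft=1024`, `hop=256`, `n_mels=64`) are assumptions and do not come from this thread.

```python
import numpy as np


def hz_to_mel(hz):
    """Convert Hz to mel (HTK-style formula)."""
    return 2595.0 * np.log10(1.0 + np.asarray(hz, dtype=float) / 700.0)


def mel_to_hz(mel):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (np.asarray(mel, dtype=float) / 2595.0) - 1.0)


def mel_filterbank(sr, n_fft, n_mels):
    """Triangular mel filters, shape (n_mels, n_fft // 2 + 1)."""
    fft_freqs = np.linspace(0.0, sr / 2.0, n_fft // 2 + 1)
    # n_mels + 2 points: each filter needs a left edge, a centre, a right edge.
    hz_points = mel_to_hz(np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2.0), n_mels + 2))
    fb = np.zeros((n_mels, fft_freqs.size))
    for m in range(n_mels):
        left, center, right = hz_points[m], hz_points[m + 1], hz_points[m + 2]
        rising = (fft_freqs - left) / (center - left)
        falling = (right - fft_freqs) / (right - center)
        fb[m] = np.clip(np.minimum(rising, falling), 0.0, None)
    return fb


def log_mel_spectrogram(signal, sr, n_fft=1024, hop=256, n_mels=64):
    """Return a log-power mel spectrogram of shape (n_mels, n_frames)."""
    signal = np.asarray(signal, dtype=float)
    if signal.size < n_fft:
        # Zero-pad very short clips so at least one frame exists.
        signal = np.pad(signal, (0, n_fft - signal.size))
    n_frames = 1 + (signal.size - n_fft) // hop
    # Index matrix: one row of sample indices per frame.
    idx = np.arange(n_fft)[None, :] + hop * np.arange(n_frames)[:, None]
    frames = signal[idx] * np.hanning(n_fft)
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2
    mel_power = power @ mel_filterbank(sr, n_fft, n_mels).T
    return 10.0 * np.log10(mel_power + 1e-10).T


# Synthetic 440 Hz tone standing in for a real recording (hypothetical input).
sr = 16000
tone = np.sin(2 * np.pi * 440.0 * np.arange(sr) / sr)
spec = log_mel_spectrogram(tone, sr)
print(spec.shape)
```

The resulting 2-D array can be treated like a single-channel image, which is what makes it a natural input for the CNN-based models listed next.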

  1. Using CNN to classify the audio according to emotions ( using VGG-16 architecture and ResNet50 as well)

  2. using CNN to extract features from the spectrogram and then apply LSTM / Bi-LSTM on the encodings.

  3. using HuggingFace speech2text and then using spacy universal sentence encoder to convert the resulting text into encoding vectors which can be predicted using an ANN.

this will be followed by evaluating the model using metrics and visualizing heatmaps of confusion matrices to analyse the error distribution
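As a rough illustration of the evaluation step, the sketch below builds a confusion matrix and an accuracy score in plain Python. The label names and the predictions are made-up sample data, not results from any model in this thread, and the helper names are hypothetical.

```python
def confusion_matrix(y_true, y_pred, labels):
    """Rows are true labels, columns are predicted labels."""
    index = {label: i for i, label in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for truth, pred in zip(y_true, y_pred):
        matrix[index[truth]][index[pred]] += 1
    return matrix


def accuracy(matrix):
    """Fraction of samples on the diagonal (correct predictions)."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / total if total else 0.0


def per_class_recall(matrix, labels):
    """Recall per true class; shows which emotions get confused most."""
    recall = {}
    for i, label in enumerate(labels):
        row_total = sum(matrix[i])
        recall[label] = matrix[i][i] / row_total if row_total else 0.0
    return recall


# Hypothetical labels and predictions, for illustration only.
labels = ["angry", "happy", "sad"]
y_true = ["happy", "sad", "angry", "happy", "sad", "angry"]
y_pred = ["happy", "sad", "happy", "happy", "angry", "angry"]

cm = confusion_matrix(y_true, y_pred, labels)
for label, row in zip(labels, cm):
    print(f"{label:>6}: {row}")
print("accuracy:", round(accuracy(cm), 3))
print("recall:", per_class_recall(cm, labels))
```

The nested list `cm` can be passed directly to a plotting library's heatmap function to visualise where the errors concentrate.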

name : Moksh patel
GitHub profile link : https://github.com/T3CH-Pyth0n
event : Kharagpur Winter of Code (KWoC)

T3CH-Pyth0n avatar Dec 16 '24 14:12 T3CH-Pyth0n

Hi @T3CH-Pyth0n sorry for replying late. Assigning this issue to you.

abhisheks008 avatar Dec 22 '24 14:12 abhisheks008

@abhisheks008 I'll be altering the approach a bit, but I'll still implement 3-4 models. Does that work?

T3CH-Pyth0n avatar Dec 23 '24 17:12 T3CH-Pyth0n

@abhisheks008 I'll be altering the approach a bit, but I'll still implement 3-4 models. Does that work?

Yes that'll work.

abhisheks008 avatar Dec 28 '24 13:12 abhisheks008

@abhisheks008 can i work on this?

Mishikasardana avatar Jul 21 '25 08:07 Mishikasardana