audio-classification-pytorch
Audio Classification
This project provides several approaches for training or fine-tuning an audio gender recognition model. The code can be reused for any other audio classification task by changing the number of classes and the input dataset.
Dataset format
The dataset should be a CSV file with two columns: `audio_path` and `label`.
audio_path label
0 /home/ai/projects/speech/dataset/asr/new-raw-0.wav female
1 /home/ai/projects/speech/dataset/asr/samples_1.wav male
2 /home/ai/projects/speech/dataset/asr/new-raw-2.wav female
3 /home/ai/projects/speech/dataset/asr/new-raw-3.wav male
4 /home/ai/projects/speech/dataset/asr/new-raw-4.wav female
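A CSV in the format described above could be produced and read back with a few lines of standard-library Python. This is only a sketch: the file paths, the helper names `write_dataset_csv` and `read_dataset_csv`, and the label-to-id mapping are illustrative assumptions, not part of this repository. The leading integers in the listing above look like a DataFrame row index rather than a third column, so they are not written here.

```python
import csv
import os
import tempfile

# Hypothetical rows: (audio_path, label). Replace with your own files.
rows = [
    ("/data/audio/sample_0.wav", "female"),
    ("/data/audio/sample_1.wav", "male"),
    ("/data/audio/sample_2.wav", "female"),
]


def write_dataset_csv(path, rows):
    """Write (audio_path, label) pairs to a CSV file with a header row."""
    with open(path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["audio_path", "label"])
        writer.writerows(rows)


def read_dataset_csv(path):
    """Read the CSV back and map each distinct label to an integer class id."""
    with open(path, newline="") as f:
        records = list(csv.DictReader(f))
    classes = sorted({r["label"] for r in records})
    label_to_id = {name: i for i, name in enumerate(classes)}
    return records, label_to_id


csv_path = os.path.join(tempfile.mkdtemp(), "dataset.csv")
write_dataset_csv(csv_path, rows)
records, label_to_id = read_dataset_csv(csv_path)

# The number of classes follows from the labels present in the dataset.
num_classes = len(label_to_id)
```

Deriving `num_classes` from the label column is one way to adapt the setup to another classification task: a different dataset with more labels yields a larger mapping without further changes to this snippet.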
Models
- LSTM_Model: trains an LSTM on MFCC features for audio classification. Trained with PyTorch Lightning.
  - The structure is adapted from the LearnedVector repository, which contains a wake-word model.
- transformer_scratch: trains a transformer block on MFCC inputs for audio classification. Trained with PyTorch Lightning.
  - The main implementation is taken from AnubhavGupta3377's Text-Classification-Models-Pytorch repository.
  - It is modified to train on audio samples.
- wav2vec2: fine-tunes wav2vec2-base as an audio classification model using the Hugging Face Trainer.
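To make the LSTM_Model entry above more concrete, here is a minimal PyTorch sketch of an LSTM classifier over MFCC frames. It is not the repository's implementation: the class name `LSTMClassifier`, the layer sizes, and the choice of 40 MFCC coefficients are assumptions made for illustration, and a random tensor stands in for real MFCC features.

```python
import torch
from torch import nn


class LSTMClassifier(nn.Module):
    """LSTM over a sequence of MFCC frames, followed by a linear classifier."""

    def __init__(self, n_mfcc=40, hidden_size=128, num_classes=2):
        super().__init__()
        # batch_first=True means inputs are shaped (batch, time, features).
        self.lstm = nn.LSTM(input_size=n_mfcc, hidden_size=hidden_size, batch_first=True)
        self.fc = nn.Linear(hidden_size, num_classes)

    def forward(self, mfcc):
        # h_n has shape (num_layers, batch, hidden_size).
        _, (h_n, _) = self.lstm(mfcc)
        # Classify from the final hidden state of the last LSTM layer.
        return self.fc(h_n[-1])


model = LSTMClassifier()

# Stand-in for a batch of 4 clips, each with 100 frames of 40 MFCCs.
dummy_mfcc = torch.randn(4, 100, 40)
logits = model(dummy_mfcc)  # one score per class for each clip
```

Changing `num_classes` is the only modification this sketch needs to move from gender recognition to a task with more labels.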
Results on Gender Recognition
The models were trained and evaluated on a custom dataset. You can download the Common Voice dataset and use its samples instead.
Model | Train Acc | Val Acc | Train F1-score | Val F1-score |
---|---|---|---|---|
LSTM | 89 | 90 | 90.83 | 91 |
Wav2vec2 | - | 96.4 | - | 96.4 |
Transformer | 85.1 | 81.7 | 87.1 | 84.6 |
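For readers unfamiliar with the two metrics in the table, the sketch below computes accuracy and a binary F1-score from predicted and true labels in plain Python. The source does not state which F1 variant (binary, macro, or weighted) was used, so treating one label as the positive class is an assumption, and the example labels are made up.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def f1_score(y_true, y_pred, positive):
    """Binary F1: harmonic mean of precision and recall for one class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Made-up labels for illustration only.
y_true = ["female", "male", "female", "male", "female"]
y_pred = ["female", "male", "male", "male", "female"]

acc = accuracy(y_true, y_pred)
f1 = f1_score(y_true, y_pred, positive="female")
```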
References
- https://github.com/LearnedVector/A-Hackers-AI-Voice-Assistant
- https://github.com/huggingface/transformers
- https://github.com/AnubhavGupta3377/Text-Classification-Models-Pytorch
- https://pytorch.org/tutorials/beginner/transformer_tutorial.html
- https://github.com/pooya-mohammadi/deep_utils