dcase_task2 icon indicating copy to clipboard operation
dcase_task2 copied to clipboard

Training General-Purpose Audio Tagging Networks with Noisy Labels and Iterative Self-Verification

Training General-Purpose Audio Tagging Networks with Noisy Labels and Iterative Self-Verification

This repository contains the corresponding code for the 2nd place submission to the first Freesound general-purpose audio tagging challenge carried out as Task 2 within the DCASE challenge 2018.

For a detailed description of the entire audio tagging system please visit the corresponding github page. In this README I just provide the technical instructions to set up the project.

Getting Started

Before we can start working with the code, we first need to set up a few things:

Setup and Requirements

Note: This package requires Python 2.7!

For a list of required python packages see the requirements.txt or just install them all at once using pip.

pip install -r requirements.txt

or the environment.yaml:

conda env create -f environment.yaml
conda activate dcase18

To install the project in develop mode run

python setup.py develop --user

in the root folder of the package.

This is what I recommend, especially if you want to try out new ideas.

Getting the Data

Then download the challenge data and organize it in the following folder structure:

<DATA_ROOT>
    - audio_train
    - audio_test
    - train.csv
    - test_post_competition.csv

Set Data and Model path

In config/settings.py you have to set the following two paths:

DATA_ROOT = "/home/matthias/shared/datasets/dcase2018_task2_release"
EXP_ROOT = "/home/matthias/experiments/dcase_task2/"

DATA_ROOT is the <DATA_ROOT> path from above.
EXP_ROOT is where the model parameters and logs will be stored.

Once this is all set up, you can switch to the detailed writeup on this github page.

Audio Tagger

In order to run the audio_tagger.py, we had to install pyaudio and portaudio in our Anaconda environment (Ubuntu 18.04):

conda install nwani::portaudio nwani::pyaudio