deeptrolldetector

Improve the raw data cleaning to get a bigger training set

Open andriosrobert opened this issue 6 years ago • 0 comments

Because the inputs to the model are spectrograms, the input dimensions come from the audio files once they are converted to spectrograms. Since the raw audio files have different lengths, they would produce inputs of different sizes, which would not work for our model. For that reason, I've trimmed the audio files to 30-second clips.

To clean the audio files, I've used ffmpeg. First, we need to convert the original raw_mp3 files to the wav format, which works better with our audio manipulation libs:

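for i in *.mp3; do
  # Strip only the .mp3 suffix so that names containing other dots stay intact
  name="${i%.mp3}"
  echo "$name"
  # Convert to 8-bit unsigned PCM wav at a 22050 Hz sample rate
  ffmpeg -i "$i" -acodec pcm_u8 -ar 22050 "${name}.wav"
done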
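To see which files did not make it through the conversion, a small check like the one below may help. It is only a sketch: `missing_wavs` is a hypothetical helper name, and it assumes each `.wav` is written next to its `.mp3` with the same base name.

```shell
# Hypothetical helper: print every .mp3 in a directory that has no .wav
# with the same base name next to it.
# The ( ... ) body runs in a subshell, so no variables leak to the caller.
missing_wavs() (
  dir="${1:-.}"
  for f in "$dir"/*.mp3; do
    [ -e "$f" ] || continue              # the glob matched nothing
    if [ ! -e "${f%.mp3}.wav" ]; then
      printf '%s\n' "${f##*/}"           # print the file name without the directory
    fi
  done
)
```

Running `missing_wavs . | wc -l` in the folder holding the mp3 files would then give the number of files that were not converted.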

Then we trim the audio files to 30-second clips so that the input size of our model is fixed:

for i in *.wav; do
  echo "$i"
  # -t 30 placed before -i limits how much of the input is read to 30 seconds
  ffmpeg -t 30 -i "$i" "30-$i"
done
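Note that `-t 30` only cuts longer files down; a source shorter than 30 seconds would still give a shorter clip and therefore a different input size. The sketch below shows one way to check for that. The function names and the 29.99 second tolerance are assumptions made for illustration, and `clip_seconds` relies on `ffprobe`, which ships with ffmpeg.

```shell
# Number of samples in a clip: duration in seconds times the sample rate.
expected_samples() {
  echo $(( $1 * $2 ))
}

# Duration of a file in seconds, as reported by ffprobe.
clip_seconds() {
  ffprobe -v error -show_entries format=duration \
    -of default=noprint_wrappers=1:nokey=1 "$1"
}

# Hypothetical helper: print trimmed clips shorter than about 30 seconds,
# followed by a tab and the duration that was reported.
# The 29.99 threshold is an assumed tolerance, not a value from the project.
short_clips() (
  for f in "${1:-.}"/30-*.wav; do
    [ -e "$f" ] || continue              # the glob matched nothing
    secs=$(clip_seconds "$f")
    # s + 0 forces a numeric comparison; an unreadable file counts as 0 seconds
    if awk -v s="$secs" 'BEGIN { exit !(s + 0 < 29.99) }'; then
      printf '%s\t%s\n' "${f##*/}" "$secs"
    fi
  done
)
```

At the 22050 Hz sample rate used in the conversion, `expected_samples 30 22050` gives 661500 samples per channel for a full clip, and `short_clips .` lists any clip that falls short of that.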

Unfortunately, after converting and trimming, the 454 raw audio files dropped to 112 due to some kind of parsing problem. If you manage to increase this number, please send a contribution so that we can get a better training set.
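One place that might be worth checking, offered as an unverified guess and not a diagnosis, is file names containing more than one dot. Taking only the text before the first dot as the output name (which is what `cut -d'.' -f1` yields) would map, for example, `01.intro.mp3` and `01.outro.mp3` to the same `01.wav`. The sketch below lists such shared prefixes; `name_collisions` is a hypothetical helper name.

```shell
# Hypothetical helper: print each "text before the first dot" prefix that is
# shared by more than one .mp3 file in the given directory.
# The ( ... ) body runs in a subshell, so the cd does not affect the caller.
name_collisions() (
  cd "${1:-.}" || exit 1
  for f in *.mp3; do
    [ -e "$f" ] || continue              # the glob matched nothing
    printf '%s\n' "${f%%.*}"             # drop everything from the first dot on
  done | sort | uniq -d                  # keep only prefixes that occur more than once
)
```

If `name_collisions .` prints nothing in the folder holding the mp3 files, the names are not colliding and the cause lies elsewhere.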

andriosrobert, Jun 23 '18 22:06