fastai_audio
Variable length WAV files
This is really cool.
However, I have tried it on my own dataset and I'm getting the following error:
> print(data)
[AudioClip (duration=4.040125s, sample_rate=16.0KHz), AudioClip (duration=4.030125s, sample_rate=16.0KHz), AudioClip (duration=4.030125s, sample_rate=16.0KHz), AudioClip (duration=4.040125s, sample_rate=16.0KHz), AudioClip (duration=4.030125s, sample_rate=16.0KHz)]...
> learn.fit_one_cycle(3)
ValueError: Expected more than 1 value per channel when training, got input size torch.Size([1, 1024])
When I look at the Magenta data, the clips all appear to be exactly 4 seconds long:
> print(data)
[AudioClip (duration=4.0s, sample_rate=16.0KHz), AudioClip (duration=4.0s, sample_rate=16.0KHz), AudioClip (duration=4.0s, sample_rate=16.0KHz), AudioClip (duration=4.0s, sample_rate=16.0KHz), AudioClip (duration=4.0s, sample_rate=16.0KHz)]...
I see this in the code, is it related?
# TODO: generalize this away from hard coding dim values
def pad_collate2d(batch):
If you can give me a general idea of what to look for, I can see if I can fix it.
Apologies for the missing documentation, but there is a flag to take care of this in AudioDataBunch: https://github.com/sevenfx/fastai_audio/blob/master/fastai_audio/data.py#L34
Just set equal_lengths to False and it should work. This uses fastai's SortishSampler to group similarly sized audio files into batches. If your files are all very close to the same length, however, you might want to add a transform to just crop/pad them all to the same length, since that's faster. Also, make sure that your files are mono -- it doesn't yet support stereo audio.
Let me know if this doesn't fix your issue - I'm planning to take another stab at updating this code over the next couple of weeks.
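As a rough illustration of the crop/pad idea and the mono check mentioned above, here is a minimal sketch using NumPy and the standard-library wave module. The names pad_or_crop and is_mono are hypothetical helpers, not part of fastai_audio, and wiring them into the library's transform pipeline is not shown. The target of 64000 samples assumes 4-second clips at 16 kHz, as in the output printed earlier in the thread.

```python
import wave

import numpy as np


def pad_or_crop(signal: np.ndarray, target_len: int) -> np.ndarray:
    """Return a 1-D mono signal that is exactly target_len samples long.

    Longer signals are cropped at the end; shorter ones are zero-padded
    on the right. (Hypothetical helper, not a fastai_audio function.)
    """
    n = signal.shape[0]
    if n >= target_len:
        return signal[:target_len]
    return np.pad(signal, (0, target_len - n), mode="constant")


def is_mono(path: str) -> bool:
    """Check whether a PCM WAV file has a single channel."""
    with wave.open(path, "rb") as wav:
        return wav.getnchannels() == 1


# Assumed target: 4 seconds at 16 kHz.
TARGET_LEN = 4 * 16000

# A clip of 64482 samples corresponds to 4.030125 s at 16 kHz.
clip = np.random.randn(64482).astype(np.float32)
fixed = pad_or_crop(clip, TARGET_LEN)
print(fixed.shape)
```

Cropping from the end is only one choice; a random crop position may work better as augmentation, at the cost of a non-deterministic transform.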
@sevenfx Awesome, I will take a look shortly. There are some breaking changes in the latest versions of fastai, so I just reverted to the version listed in the notebook you committed. I will let you know how I go with this.
It works great! :)
This is really cool
My suggestion: why don't you make a pull request to the original fastai repo to make this a part of it? This is really cool work.
I am new to deep learning. Can you please help me import this library into Colab? I am using !pip install git+https://github.com/sevenfx/fastai_audio but it gives me the following error:
Command "python setup.py egg_info" failed with error code 1 in
@tahercoolguy I don't think it's been set up correctly to be installed via pip.
You can use some shell commands in the notebook to get the code, like this:
# run once to download and extract fastai_audio
import os
source_url = "https://github.com/sevenfx/fastai_audio/archive/master.zip"
filename = os.path.basename(source_url)
# download zip from github
!wget -c $source_url
# unzip file
!unzip `pwd`/$filename
# move fastai_audio folder up to current directory
!mv `pwd`/fastai_audio-master/* `pwd`
# now you should be able to import fastai_audio
import fastai_audio
For the dependencies you could try:
!pip install fastai==1.0.43.post1
!pip install librosa==0.6.2
or if you want the latest versions:
!pip install fastai
!pip install librosa
Good luck! :)
@sevenfx I agree, this would be cool to merge into the main fastai repo. Should we ask Jeremy Howard for his thoughts?
Thank you @madhavajay for the script. We should definitely ask about merging fastai_audio into the fastai repo.
@tahercoolguy @madhavajay Yeah, I haven't set things up to be installed by pip yet -- may look into that in the next couple of days.
Jeremy has actually tweeted that there is going to be an official fastai.audio coming soon (https://twitter.com/jeremyphoward/status/1093124046873518080) - this was meant to be more of an experimental branch, although I may continue dev on it when I have more free time.
I'm not sure if it's my dataset, but I'm getting an error: intercept_args() got an unexpected keyword argument 'val_bs'
My DataBunch is:
batch_size = 64
data = (AudioItemList.from_folder(path)
        .random_split_by_pct()
        .label_from_folder()
        .add_test_folder()
        .databunch(bs=batch_size, tfms=tfms, equal_lengths=False))
and the full error is:
/usr/local/lib/python3.6/dist-packages/scipy/io/wavfile.py:273: WavFileWarning: Chunk (non-data) not understood, skipping it.
WavFileWarning)
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-101-6eb78fbe7b07> in <module>()
4 .label_from_folder()
5 .add_test_folder()
----> 6 .databunch(bs=batch_size,tfms=tfms,equal_lengths=False))
7 # data
1 frames
/content/fastai_audio/fastai_audio/data.py in create(cls, train_ds, valid_ds, test_ds, path, bs, equal_lengths, length_col, tfms, **kwargs)
44 train_lengths = train_ds.lengths(length_col)
45 train_sampler = SortishSampler(train_ds.x, key=lambda i: train_lengths[i], bs=bs//2)
---> 46 train_dl = DataLoader(train_ds, batch_size=bs, sampler=train_sampler, **kwargs)
47
48 # precalculate lengths ahead of time if they aren't included in xtra
TypeError: intercept_args() got an unexpected keyword argument 'val_bs'
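One possible reading of this traceback, offered as an assumption rather than a confirmed diagnosis: create() forwards its **kwargs unchanged to DataLoader, and the installed fastai version seems to pass a val_bs argument that this DataLoader wrapper does not accept. Given the breaking changes mentioned earlier in the thread, reverting to the pinned fastai version may avoid it. Alternatively, the extra key could be dropped before the kwargs are forwarded. The sketch below shows that idea with a hypothetical helper, drop_unsupported, which is not part of fastai or fastai_audio.

```python
def drop_unsupported(kwargs, unsupported=("val_bs",)):
    """Return a copy of kwargs without keys the downstream call rejects.

    Hypothetical helper: the default tuple of rejected keys is an
    assumption based on the error message in the traceback.
    """
    return {k: v for k, v in kwargs.items() if k not in unsupported}


# Example: simulate the kwargs that reach create() via **kwargs.
incoming = {"val_bs": 64, "num_workers": 2}
cleaned = drop_unsupported(incoming)
print(cleaned)  # {'num_workers': 2}
```

If this reading is right, the filtered dictionary would be what gets passed on in place of **kwargs in the DataLoader(...) call at line 46 of data.py shown in the traceback.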