fastai_audio

Variable length WAV files

Open madhavajay opened this issue 6 years ago • 9 comments

This is really cool.

However, I have tried it on my own dataset and I'm getting the following errors:

> print(data)

[AudioClip (duration=4.040125s, sample_rate=16.0KHz), AudioClip (duration=4.030125s, sample_rate=16.0KHz), AudioClip (duration=4.030125s, sample_rate=16.0KHz), AudioClip (duration=4.040125s, sample_rate=16.0KHz), AudioClip (duration=4.030125s, sample_rate=16.0KHz)]...

> learn.fit_one_cycle(3)
ValueError: Expected more than 1 value per channel when training, got input size torch.Size([1, 1024])

When I look at the Magenta data, it looks to be all 4-second WAVs:

> print(data)

[AudioClip (duration=4.0s, sample_rate=16.0KHz), AudioClip (duration=4.0s, sample_rate=16.0KHz), AudioClip (duration=4.0s, sample_rate=16.0KHz), AudioClip (duration=4.0s, sample_rate=16.0KHz), AudioClip (duration=4.0s, sample_rate=16.0KHz)]...

I see this in the code, is it related?

# TODO: generalize this away from hard coding dim values
def pad_collate2d(batch):

If you can give me a general idea of what to look for, I can see if I can fix it.

madhavajay avatar Jan 31 '19 07:01 madhavajay

Apologies for the missing documentation, but there is a flag to take care of this in AudioDataBunch: https://github.com/sevenfx/fastai_audio/blob/master/fastai_audio/data.py#L34

Just set equal_lengths to False and it should work. This uses fastai's SortishSampler to group similarly sized audio files into batches. If your files are all very close to the same length, however, you might want to add a transform that crops or pads them all to the same length, since that's faster. Also, make sure that your files are mono; it doesn't support stereo audio yet.
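As a rough illustration of what grouping similarly sized files into batches means, here is a minimal sketch in plain Python. It is a simplification and not fastai's SortishSampler implementation; the function name length_grouped_batches and the sample durations are made up for the example.

```python
import random


def length_grouped_batches(lengths, batch_size, seed=0):
    """Return batches of indices whose items have similar lengths.

    Simplified stand-in for the idea behind a "sortish" sampler:
    sort indices by length, cut them into batches, then shuffle the
    order of the batches so training does not always see short clips first.
    """
    order = sorted(range(len(lengths)), key=lambda i: lengths[i])
    batches = [order[i:i + batch_size] for i in range(0, len(order), batch_size)]
    random.Random(seed).shuffle(batches)
    return batches


# Hypothetical clip durations in seconds
durations = [4.04, 1.2, 4.03, 1.25, 2.5, 2.4]
batches = length_grouped_batches(durations, batch_size=2)
for batch in batches:
    print(batch, [durations[i] for i in batch])
```

Because each batch only contains clips of similar length, very little padding is needed within a batch.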

Let me know if this doesn't fix your issue. I'm planning to take another stab at updating this code over the next couple of weeks.

jhartquist avatar Jan 31 '19 19:01 jhartquist
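The crop/pad and mono suggestions above could look roughly like the sketch below. It uses only NumPy on raw sample arrays; the helper names to_mono and crop_or_pad are hypothetical, and how to register such a function as a transform in fastai_audio is not shown here. It assumes stereo data is shaped (n_samples, n_channels), as returned by scipy.io.wavfile.read.

```python
import numpy as np


def to_mono(signal: np.ndarray) -> np.ndarray:
    """Average the channels of a (n_samples, n_channels) array.

    One-dimensional (already mono) input is returned unchanged.
    """
    if signal.ndim == 1:
        return signal
    return signal.mean(axis=1)


def crop_or_pad(signal: np.ndarray, target_len: int) -> np.ndarray:
    """Return a 1-D signal with exactly target_len samples."""
    if len(signal) >= target_len:
        return signal[:target_len]  # crop the tail
    # zero-pad the end up to the target length
    return np.pad(signal, (0, target_len - len(signal)), mode="constant")


sample_rate = 16000
target_len = 4 * sample_rate  # 4 seconds at 16 kHz

# Synthetic stand-in for a 4.040125 s clip (64642 samples at 16 kHz)
clip = np.ones(64642, dtype=np.float32)
fixed = crop_or_pad(to_mono(clip), target_len)
print(fixed.shape)
```

Cropping from the end is only one choice; a random crop or centered padding may suit some datasets better.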

@sevenfx awesome, I will take a look shortly. There are some breaking changes in the latest versions of fastai, so I just reverted to the version listed in the notebook you committed. I will let you know how I go with this.

madhavajay avatar Feb 05 '19 05:02 madhavajay

It works great! :)

madhavajay avatar Feb 13 '19 12:02 madhavajay

This is really cool

My suggestion: why don't you make a pull request to the original fastai repo to make this a part of it? This is really cool work.

tahercoolguy avatar Feb 19 '19 07:02 tahercoolguy

I am new to deep learning. Can you please help me import this library into Colab? I am using !pip install git+https://github.com/sevenfx/fastai_audio but it gives me the following error:

Command "python setup.py egg_info" failed with error code 1 in

tahercoolguy avatar Feb 20 '19 15:02 tahercoolguy

@tahercoolguy I don't think it's been set up correctly to be installed via pip.

You can use some CLI commands to get the code like this:

# run once to download and extract fastai_audio
import os
source_url = "https://github.com/sevenfx/fastai_audio/archive/master.zip"
filename = os.path.basename(source_url)
# download zip from github
!wget -c $source_url
# unzip file
!unzip `pwd`/$filename
# move fastai_audio folder up to current directory
!mv `pwd`/fastai_audio-master/* `pwd`

# now you should be able to import fastai_audio
import fastai_audio

For the dependencies you could try:

!pip install fastai==1.0.43.post1 
!pip install librosa==0.6.2

or if you want the latest versions:

!pip install fastai
!pip install librosa

Good luck! :)

@sevenfx I agree, this would be cool to merge into the main fastai repo. Should we ask Jeremy Howard for his thoughts?

madhavajay avatar Feb 24 '19 04:02 madhavajay

Thank you @madhavajay for the script. We should definitely ask about merging fastai_audio into the fastai repo.

tahercoolguy avatar Feb 24 '19 07:02 tahercoolguy

@tahercoolguy @madhavajay Yeah, I haven't set things up to be installed by pip yet -- may look into that in the next couple of days.

Jeremy has actually tweeted that there is going to be an official fastai.audio coming soon (https://twitter.com/jeremyphoward/status/1093124046873518080). This was meant to be more of an experimental branch, although I may continue dev on it when I have more free time.

jhartquist avatar Feb 24 '19 08:02 jhartquist

I'm not sure if it's my dataset, but I'm getting an error: intercept_args() got an unexpected keyword argument 'val_bs'

My DataBunch is:

batch_size = 64
data = (AudioItemList.from_folder(path)
                     .random_split_by_pct()
                     .label_from_folder()
                     .add_test_folder()
                     .databunch(bs=batch_size,tfms=tfms,equal_lengths=False))

and the full error is:

/usr/local/lib/python3.6/dist-packages/scipy/io/wavfile.py:273: WavFileWarning: Chunk (non-data) not understood, skipping it.
  WavFileWarning)
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-101-6eb78fbe7b07> in <module>()
      4                      .label_from_folder()
      5                      .add_test_folder()
----> 6                      .databunch(bs=batch_size,tfms=tfms,equal_lengths=False))
      7 # data

1 frames
/content/fastai_audio/fastai_audio/data.py in create(cls, train_ds, valid_ds, test_ds, path, bs, equal_lengths, length_col, tfms, **kwargs)
     44             train_lengths = train_ds.lengths(length_col)
     45             train_sampler = SortishSampler(train_ds.x, key=lambda i: train_lengths[i], bs=bs//2)
---> 46             train_dl = DataLoader(train_ds, batch_size=bs, sampler=train_sampler, **kwargs)
     47 
     48             # precalculate lengths ahead of time if they aren't included in xtra

TypeError: intercept_args() got an unexpected keyword argument 'val_bs'

tbass134 avatar May 22 '19 20:05 tbass134
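Judging only from the traceback above, val_bs appears to reach DataLoader through **kwargs in AudioDataBunch.create, and DataLoader does not accept it. This is likely related to the fastai version differences mentioned earlier in the thread, though that is an assumption. One possible workaround is to drop the unsupported key before forwarding the arguments. The sketch below shows the idea with a stand-in function instead of the real DataLoader; drop_unsupported and make_loader are hypothetical names, not part of fastai or fastai_audio.

```python
def drop_unsupported(kwargs, unsupported=("val_bs",)):
    """Return a copy of kwargs without keys the callee cannot accept."""
    return {k: v for k, v in kwargs.items() if k not in unsupported}


def make_loader(dataset, batch_size=1, sampler=None, num_workers=0):
    """Hypothetical stand-in for a DataLoader constructor.

    Like the real constructor in the traceback, it has no val_bs
    parameter, so passing val_bs would raise a TypeError.
    """
    return {"n_items": len(dataset), "batch_size": batch_size,
            "num_workers": num_workers}


# kwargs as they might arrive in a create() classmethod (assumed values)
kwargs = {"val_bs": 64, "num_workers": 2}

loader = make_loader([1, 2, 3], batch_size=64, **drop_unsupported(kwargs))
print(loader)
```

Pinning fastai to the version listed in the notebook, as done earlier in this thread, may avoid the problem without any code change.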