torch-audiomentations
Cache background_noise RMS data
Boost background_noise performance.
- Reduce audio decoding and file I/O
- Reduce RMS computation; note that rms(partial audio) may differ from rms(full audio). A rough sketch of the idea follows below.
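For context, here is a minimal sketch of the kind of RMS caching being proposed; the `_rms_cache` dict and `get_cached_rms` helper are hypothetical names, not part of the current torch-audiomentations API:

```python
# Hypothetical sketch: compute each background-noise file's RMS once and
# reuse it, instead of decoding the file and recomputing RMS on every call.
import torch

_rms_cache = {}  # file path -> cached RMS tensor


def calculate_rms(samples: torch.Tensor) -> torch.Tensor:
    """Root mean square of an audio tensor."""
    return torch.sqrt(torch.mean(samples ** 2))


def get_cached_rms(file_path: str, samples: torch.Tensor) -> torch.Tensor:
    """Return the RMS for file_path, computing it only on the first request.

    Caveat from the PR description: the cached value is the RMS of the full
    file, which may differ from the RMS of a randomly cropped slice of it.
    """
    if file_path not in _rms_cache:
        _rms_cache[file_path] = calculate_rms(samples)
    return _rms_cache[file_path]
```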
Hi fantasyRgg, and thanks for your PR 😃
Just for context, so I understand the problem you're proposing to solve, I want to ask some questions:
- How large is your background noise dataset?
- If you are training a model, how many workers do you use for preparing the audio examples that go into the training batches?
- How much memory (RAM) is there on the computer where you are doing the training?
- What audio file format are your background noise files? And do they have the same sample rate as the "clean" input audios that the noises get added to?
- Are you using an SSD or a HDD?
Ideally, a good solution would work well for any combination of answers to those questions.
- How large is your background noise dataset?
  About 2k records.
- If you are training a model, how many workers do you use for preparing the audio examples that go into the training batches?
  Only one worker. I tried multiple workers, but it was not fast enough.
- How much memory (RAM) is there on the computer where you are doing the training?
  I cached the samples and the noises; the samples took 7 GB and the noises took 1.5 GB.
- What audio file format are your background noise files? And do they have the same sample rate as the "clean" input audios that the noises get added to?
  I don't think the audio format and sample rate are a problem. The `audio: Audio` parameter takes care of that.
- Are you using an SSD or a HDD?
  HDD
Thanks for the insight :) Indeed, in your case it makes sense to apply caching like this.
- [x] HDD
- [x] Not very large dataset - fits in RAM
- [x] Single worker
My own use case is quite different, and would actually be best without caching:
- [x] SSD
- [x] Very large dataset, cannot fit in RAM
- [x] Many workers
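One way to serve both kinds of setups could be to make the caching opt-in. A rough sketch under that assumption (the class name and the `cache_rms` flag are hypothetical, not the library's actual API):

```python
# Hypothetical sketch: opt-in RMS caching, so a small noise set on a HDD
# with a single worker can enable it, while a large dataset streamed from
# an SSD with many workers leaves it off and keeps memory usage flat.
import torch


class BackgroundNoiseRmsProvider:
    def __init__(self, cache_rms: bool = False):
        self.cache_rms = cache_rms  # off by default
        self._rms_cache = {}        # file path -> RMS tensor

    def rms_for(self, file_path: str, samples: torch.Tensor) -> torch.Tensor:
        if self.cache_rms and file_path in self._rms_cache:
            return self._rms_cache[file_path]
        rms = torch.sqrt(torch.mean(samples ** 2))
        if self.cache_rms:
            self._rms_cache[file_path] = rms
        return rms
```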
> I don't think the audio format and sample rate are a problem. The `audio: Audio` parameter takes care of that.
The reason why I asked is that resampling (in case of mismatch) may take a significant amount of CPU time, slowing down the model training.
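As a rough illustration of that cost, one can time torchaudio's resampler on synthetic data (numbers vary by machine; this is not a benchmark of the library itself):

```python
# Illustration only: time how long it takes torchaudio to resample a
# synthetic 30-second clip from 48 kHz to 16 kHz on the CPU.
import time

import torch
import torchaudio.functional as F

noise = torch.randn(1, 48_000 * 30)  # placeholder: 30 s of noise at 48 kHz

start = time.perf_counter()
resampled = F.resample(noise, orig_freq=48_000, new_freq=16_000)
elapsed = time.perf_counter() - start
print(f"{noise.shape[-1]} -> {resampled.shape[-1]} samples in {elapsed:.3f} s")
```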
I'm currently wrapping up the 0.11 release, and then I have some work preparing a few new transforms. After that, I'll hopefully have more time to consider this caching feature. In the meantime, thanks for your patience, and I hope you're okay with using your own fork for now.