torch-audiomentations
Pre-loading background noise (and IR) in __init__?
If one of the objectives of torch-audiomentations is to do fast augmentation on GPU as part of a larger network, reading from disk for each new audio sample would slow down the forward pass dramatically, wouldn't it?
https://github.com/asteroid-team/torch-audiomentations/blob/02752c9d6b7892c67d541af0c5279ec00f8216e3/torch_audiomentations/augmentations/background_noise.py#L47-L51
Would it make sense to allow the user to preload the whole background noise collection into memory during __init__ and then read from memory in apply_transform?
Or, if the background noise collection is too big to fit in memory, we could load a bunch of them, use them for a while, and then load another (random) bunch of them, etc.
I've tried both, and it turns out that when I have a fast SSD and a swift wav loader, it doesn't really make much of a difference, as the bottleneck is usually elsewhere.
In my case, I had like 6 GB of IRs, which I didn't want to keep in memory, and I had a fast SSD. In a different scenario where the IR files are much fewer/smaller and I/O is a bottleneck, it does indeed make sense to cache these files.
I usually have 64GB or more of RAM but quite slow I/O, so yes, it makes complete sense.
> Or, if the background noise collection is too big to fit in memory, we could load a bunch of them, use them for a while, and then load another (random) bunch of them, etc.
This is more complicated, because you don't know what the user can afford, etc. But it is still doable.
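The rotating-chunk idea could look something like the sketch below. Everything here is hypothetical (class name, pool_size, uses_per_pool, and the injected load_fn are not part of torch-audiomentations); it just illustrates decoding a random subset, serving from it for a while, then swapping in a fresh subset:

```python
import random


class RotatingNoisePool:
    """Hypothetical sketch: keep only a random chunk of a large noise
    collection in memory, refresh the chunk after a fixed number of uses."""

    def __init__(self, paths, load_fn, pool_size=64, uses_per_pool=1000):
        self.paths = list(paths)
        self.load_fn = load_fn  # your audio decoder, e.g. torchaudio.load
        self.pool_size = min(pool_size, len(self.paths))
        self.uses_per_pool = uses_per_pool
        self._uses = 0
        self._reload_pool()

    def _reload_pool(self):
        # Drop the old chunk and decode a fresh random subset of files.
        chosen = random.sample(self.paths, self.pool_size)
        self.pool = [self.load_fn(p) for p in chosen]
        self._uses = 0

    def random_noise(self):
        # Refresh the in-memory chunk once it has been used enough times.
        if self._uses >= self.uses_per_pool:
            self._reload_pool()
        self._uses += 1
        return random.choice(self.pool)
```

The pool_size / uses_per_pool knobs are exactly the "what can the user afford" question raised above, which is why this variant is harder to get right than a plain preload.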
But the first option is definitely possible, and a preload_wavs flag or something like that would do the job.
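A minimal sketch of what such a flag could look like, assuming an injectable decoder (the class name, preload_wavs parameter, and load_fn are all hypothetical, not the actual torch-audiomentations API):

```python
import random


class PreloadedNoiseStore:
    """Hypothetical helper: with preload_wavs=True, decode every
    background-noise file once in __init__ and sample from RAM in
    apply_transform; otherwise fall back to per-call disk reads."""

    def __init__(self, noise_paths, load_fn, preload_wavs=True):
        # load_fn is whatever decoder you already use,
        # e.g. torchaudio.load or soundfile.read.
        self.paths = list(noise_paths)
        self.load_fn = load_fn
        # One-time I/O cost at construction when preloading.
        self.cache = [load_fn(p) for p in self.paths] if preload_wavs else None

    def random_noise(self):
        if self.cache is not None:
            return random.choice(self.cache)  # cheap in-memory lookup
        return self.load_fn(random.choice(self.paths))  # disk read (current behaviour)
```

The trade-off discussed in this thread falls out directly: preload_wavs=True pays RAM and startup time to remove I/O from the forward pass, while preload_wavs=False keeps the existing read-per-sample behaviour.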
Yes. Big RAM, slow I/O as well, here :)
> But the first option is definitely possible and a preload_wavs flag or something like that could completely do the job.
Also, maybe cache them as they are read.
Maybe better than loading everything at the beginning, yes: then we don't have to wait when we instantiate the class.