Piotr Żelasko

523 comments of Piotr Żelasko

Interesting. How many GPUs? Can you also try increasing the buffer size to 50k? Otherwise the batch duration may be too low to notice a difference. I observed a 10%...

Aside from that, your max duration seems low for an A100. Try adding `quadratic_duration=15` to the sampler and you'll probably be able to increase max duration by 100-200 (but I'd expect...
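A small sketch of what `quadratic_duration` does, as I understand it: instead of charging each cut its raw duration against the sampler's `max_duration` budget, a quadratic penalty is added so that long cuts cost disproportionately more. The formula below (`duration + duration**2 / quadratic_duration`) is my assumption of the penalty shape; check Lhotse's `TimeConstraint` implementation for the exact definition.

```python
from typing import Optional


def effective_duration(
    duration: float, quadratic_duration: Optional[float] = None
) -> float:
    """Duration 'charged' against the sampler's max_duration budget
    (assumed penalty shape; see lhotse's TimeConstraint for the real one)."""
    if quadratic_duration is None:
        return duration
    # Quadratic penalty: long cuts cost disproportionately more, which models
    # the quadratic memory cost of attention over long sequences.
    return duration + duration ** 2 / quadratic_duration


# With quadratic_duration=15, a 5 s cut is charged ~6.67 s,
# but a 30 s cut is charged 90 s:
print(effective_duration(5.0, quadratic_duration=15))
print(effective_duration(30.0, quadratic_duration=15))
```

Because the rare long utterances no longer dominate the budget linearly, you can raise `max_duration` without OOM-ing on batches full of long cuts.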

Mmm, it seems you are using webdataset or Lhotse Shar, so when the batch size or buffer size grows, the initialization of the dataloader (on the first step of iteration) takes longer as...

Just pushed a version that is better tested and supports both map-style and iterable-style datasets.

I've tested this change more thoroughly and I'm now confident it helps with the training speed. When training NeMo FastConformer RNNT+CTC ASR on a ~20k hours dataset with 16 GPUs...

Also, could you add an entry in the dataset table in docs/corpus.rst?

It seems the tests are failing on importing `num2words`. Can you make it into a local import guarded by `is_module_available`? (Please search the lhotse sources for `is_module_available` to see an example...
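The pattern being asked for looks roughly like this: keep the optional dependency out of module-level imports so that importing lhotse never fails when it is missing, and only import it inside the function that needs it. The real helper lives in lhotse's utils; a minimal stand-in built on `importlib` is included here so the sketch is self-contained, and `spell_out_numbers` is a hypothetical function, not lhotse API.

```python
import importlib.util


def is_module_available(name: str) -> bool:
    """Minimal stand-in for lhotse's helper: True if `name` can be imported."""
    return importlib.util.find_spec(name) is not None


def spell_out_numbers(text: str) -> str:
    """Hypothetical text-normalization step that needs the optional dep."""
    if not is_module_available("num2words"):
        raise ImportError(
            "num2words is not installed; run 'pip install num2words' to use it."
        )
    # Local import: only executed when the feature is actually used.
    from num2words import num2words

    return " ".join(
        num2words(int(tok)) if tok.isdigit() else tok for tok in text.split()
    )
```

This way the test suite (and any user who never calls the feature) doesn't need `num2words` installed at all.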

It will be more robust to split your manifest into parts and process each part separately. You can launch the script multiple times, passing the GPU ID as an argument.
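A sketch of that workflow, assuming a JSONL manifest: shard it into one piece per GPU, then launch your processing script once per shard with the GPU index as an argument. The file naming and round-robin assignment below are illustrative choices, not Lhotse behavior (Lhotse also ships its own `split`/`split_lazy` utilities for this).

```python
from pathlib import Path
from typing import List


def split_jsonl(manifest: Path, out_dir: Path, num_parts: int) -> List[Path]:
    """Split a JSONL manifest into num_parts files, one per worker/GPU."""
    lines = manifest.read_text().splitlines()
    out_dir.mkdir(parents=True, exist_ok=True)
    parts = []
    for i in range(num_parts):
        part = out_dir / f"{manifest.stem}.{i}{manifest.suffix}"
        # Round-robin assignment keeps the parts roughly equal in size.
        part.write_text("\n".join(lines[i::num_parts]) + "\n")
        parts.append(part)
    return parts
```

Each launched process then reads only its own `cuts.<gpu_id>.jsonl`, so a crash on one GPU loses at most one shard's worth of work.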

You'd define a transform class for that and add the relevant methods to recording/cut. You can see this PR for an end-to-end example: https://github.com/lhotse-speech/lhotse/pull/382/files#diff-add451896faa625c1820580ab6ad64bef75e2886d551efc0f5705100ea62b28a These transforms are intended mostly for...
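A shape-only sketch of the transform-class pattern: a hypothetical `Gain` transform shown on plain Python lists so the example stands alone. The real Lhotse transforms operate on numpy arrays, are serializable into the manifest, and need the extra wiring into `Recording`/`Cut` shown in the linked PR.

```python
from dataclasses import asdict, dataclass
from typing import List


@dataclass
class Gain:
    """Hypothetical transform: scale audio samples by a constant factor."""

    factor: float

    def __call__(self, samples: List[float], sampling_rate: int) -> List[float]:
        # Apply the transform to raw audio samples.
        return [s * self.factor for s in samples]

    def to_dict(self) -> dict:
        # Manifests store transforms as plain dicts so they round-trip via JSON.
        return {"name": type(self).__name__, "kwargs": asdict(self)}
```

The dataclass + `to_dict` combination is what lets a transform be recorded lazily in the manifest and re-applied when the audio is actually loaded.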

There are two methods, `split` and `split_lazy`, defined on each manifest type: https://github.com/lhotse-speech/lhotse/blob/4f014b13202c724d484e0471343053a261487b8a/lhotse/cut/set.py#L821-L882 They are also accessible from the CLI: https://github.com/lhotse-speech/lhotse/blob/4f014b13202c724d484e0471343053a261487b8a/lhotse/bin/modes/manipulation.py#L130-L215