Piotr Żelasko

Results 523 comments of Piotr Żelasko

To fix the `len` issue, you need to set `trainer.max_steps` to how long you want to train, and `limit_train_batches` to some value (e.g. 1k steps).
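
For illustration, a minimal sketch of that configuration with a PyTorch Lightning `Trainer` could look like the following; the concrete numbers are placeholders, not values from this thread:

```python
import lightning.pytorch as pl  # or `import pytorch_lightning as pl` on older versions

trainer = pl.Trainer(
    # Stop after this many optimizer steps instead of relying on the dataset length.
    max_steps=100_000,
    # Treat this many batches as one "epoch" (useful with iterable/unsized datasets).
    limit_train_batches=1_000,
)
```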

Thanks for letting me know. This will help narrow it down. Once I figure out what caused the issue, I'll let you know here.

> You suggested that I should shard my dataset. Is this generally advisable to shard datasets for any training set-up, or specifically important because of the concurrent_bucketing setting being set...

> Do you know whether disabling concurrent_bucketing as you suggested above would have caused the below error? It happens approx. every 10000 steps (1 pseudo-epoch), so I have to restart...

There's not a lot of detail, but if I had to guess, this could be a CPU OOM. You can verify by monitoring memory usage with a tool like htop or nmon...
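
If a command-line monitor is inconvenient, a rough alternative is to log RAM usage from inside the training loop. This is just an illustrative check using psutil, not something from the original discussion:

```python
import psutil

def log_ram_usage(prefix: str = "") -> None:
    # Report overall system memory; a steady climb across steps points to a CPU-side leak.
    mem = psutil.virtual_memory()
    print(f"{prefix}RAM used: {mem.used / 1e9:.1f} / {mem.total / 1e9:.1f} GB ({mem.percent}%)")
```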

Very surprising. Anyhow, glad you figured it out.

Is the dynamic bucketing sampler used for the validation set as well in the deadlocked run?
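
For context, the question is whether the validation DataLoader is built with Lhotse's `DynamicBucketingSampler` (as in the sketch below) or with a plain sampler; `dev_cuts` and the duration value are placeholders for illustration:

```python
from lhotse.dataset import DynamicBucketingSampler, SimpleCutSampler

# Bucketing sampler on the dev set (the setup being asked about):
dev_sampler = DynamicBucketingSampler(dev_cuts, max_duration=200.0, shuffle=False)

# A plain, non-bucketing alternative for validation:
# dev_sampler = SimpleCutSampler(dev_cuts, max_duration=200.0, shuffle=False)
```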

Could you try with https://github.com/lhotse-speech/lhotse/pull/1355?

Makes sense... could you make a PR with the fix? Also, can you run `lhotse.validate()` on the input cut and see if it finds anything wrong with it?
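
For reference, a minimal way to run that check might look like the sketch below; the manifest path is a placeholder for the problematic data:

```python
from lhotse import CutSet, validate

cuts = CutSet.from_file("cuts.jsonl.gz")  # placeholder path to the input cut manifest
for cut in cuts:
    # Runs Lhotse's built-in consistency checks on each cut.
    validate(cut)
```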