Piotr Żelasko
Possibly one of the manifest paths is corrupted or doesn’t exist?
It’s not expected. Can you share the manifest (or a part of it) to reproduce?
Thanks! I'd be fine with setting the default tolerance to 0.1 if you want to make that change as well.
@yunxinmengze can you reformat the PR with black and commit again?
Hi Yuekai, it would be nice to have a HF dataset adapter for Lhotse. We may call it `HFDatasetIterator`. Since HF datasets don't provide a common schema for every dataset,...
Resolved by https://github.com/lhotse-speech/lhotse/pull/1433
I suggest either moving to the Lhotse Shar format (see the tutorial in the examples directory), or sharding your manifest into many small chunks and using `CutSet.from_files` with random seed...
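To illustrate the sharding option from the comment above, here is a minimal sketch that splits a manifest into small chunks using only the Python standard library. It assumes the manifest is a JSONL file (optionally gzipped) with one cut per line. The function name `shard_jsonl_manifest`, the `lines_per_shard` parameter, and the `cuts.NNNNNN.jsonl.gz` naming are hypothetical choices for this example, not part of Lhotse.

```python
import gzip
from pathlib import Path
from typing import List


def shard_jsonl_manifest(
    manifest_path: Path, output_dir: Path, lines_per_shard: int = 1000
) -> List[Path]:
    """Split a JSONL(.gz) manifest into gzipped chunks of at most `lines_per_shard` lines.

    Hypothetical helper for illustration; it copies lines verbatim and does not
    parse or validate the cut entries.
    """
    if lines_per_shard < 1:
        raise ValueError("lines_per_shard must be a positive integer")

    output_dir.mkdir(parents=True, exist_ok=True)
    # Pick a reader based on the file extension (assumed convention: ".gz" means gzip).
    opener = gzip.open if manifest_path.suffix == ".gz" else open

    shard_paths: List[Path] = []
    out = None
    try:
        with opener(manifest_path, "rt", encoding="utf-8") as src:
            for index, line in enumerate(src):
                # Start a new shard every `lines_per_shard` lines.
                if index % lines_per_shard == 0:
                    if out is not None:
                        out.close()
                    shard_path = output_dir / f"cuts.{len(shard_paths):06d}.jsonl.gz"
                    shard_paths.append(shard_path)
                    out = gzip.open(shard_path, "wt", encoding="utf-8")
                out.write(line)
    finally:
        # Close the last open shard even if reading fails midway.
        if out is not None:
            out.close()
    return shard_paths
```

The returned list of shard paths is what you would then hand to `CutSet.from_files`, as suggested in the comment. Smaller shards give finer-grained shuffling across files at the cost of more files on disk.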
Not sure what your question/issue is.
You can use [`CutSet.save_audios()`](https://github.com/lhotse-speech/lhotse/blob/master/lhotse/cut/set.py#L2157) and call `.to_file()` on the returned result, i.e.

```python
cuts = cuts.save_audios(audio_dir, ...)
cuts.to_file(manifest_path)
```

If you want to avoid the copy, I think the best...
I can't see any obvious part of the code that would cause memory leaks. We are not storing the objects returned from the dataloader - the results are processed and...