Piotr Żelasko

523 comments by Piotr Żelasko

@luomingshuang can you test this PR? It seems to work, but I don't have the means to test it more thoroughly at the moment: https://github.com/lhotse-speech/lhotse/pull/683

I updated https://github.com/lhotse-speech/lhotse/pull/683 to handle this case too now.

We initially supported `reco2dur` but unfortunately it was not precise enough for the durations and we were running into issues with mismatched manifest metadata and audio that was loaded from...

If you can contribute the relevant option for `load_kaldi_data_dir` (disabled by default, enabled via argument/flag) in Lhotse I'd be happy to merge that PR.

My belief is that for large datasets we should never care about the number of epochs, and only care about the number of iterations. It makes the checkpointing and validation...
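To illustrate the iteration-based view, here is a minimal sketch of a training loop driven by a step counter rather than by epochs. It is not taken from any particular recipe: `train_by_steps`, `endless`, and the `train_step`, `validate`, and `save_checkpoint` callbacks are all hypothetical names used only for this example.

```python
def endless(make_iterable):
    """Yield items forever by re-creating the iterable whenever it runs out."""
    while True:
        yield from make_iterable()


def train_by_steps(make_batches, train_step, validate, save_checkpoint,
                   max_steps, validate_every, checkpoint_every):
    """Run training for a fixed number of steps (all callbacks are hypothetical).

    Validation and checkpointing are scheduled by step count alone, so their
    cadence does not depend on how large the dataset is.
    """
    stream = endless(make_batches)
    for step in range(1, max_steps + 1):
        train_step(next(stream))
        if step % validate_every == 0:
            validate(step)
        if step % checkpoint_every == 0:
            save_checkpoint(step)


if __name__ == "__main__":
    seen = []
    train_by_steps(
        make_batches=lambda: iter(["a", "b", "c"]),
        train_step=seen.append,
        validate=lambda step: print(f"validate at step {step}"),
        save_checkpoint=lambda step: print(f"checkpoint at step {step}"),
        max_steps=7,
        validate_every=2,
        checkpoint_every=3,
    )
    print(seen)
```

With this structure, the dataset boundary is only an implementation detail of `endless`; the schedule is expressed entirely in steps.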

The current implementations are very lightweight: they basically record how many cuts were iterated over, along with the arguments of the sampler. During loading, the sampler iterates over the input...
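The idea of a lightweight sampler state can be sketched in plain Python as follows. This is not the Lhotse implementation: `CountingSampler` is a hypothetical class, and the restore behavior (skipping over items that were already consumed) is an assumption about what such a scheme might do, since the comment above is truncated.

```python
class CountingSampler:
    """Toy batch sampler whose state is its arguments plus a consumed-item count."""

    def __init__(self, items, batch_size):
        self.items = items
        self.batch_size = batch_size
        self.consumed = 0  # number of items already yielded in this pass

    def __iter__(self):
        start = self.consumed  # assumption: resume by skipping consumed items
        batch = []
        for idx, item in enumerate(self.items):
            if idx < start:
                continue  # fast-forward past items seen before the checkpoint
            batch.append(item)
            if len(batch) == self.batch_size:
                self.consumed += len(batch)
                yield batch
                batch = []
        if batch:
            self.consumed += len(batch)
            yield batch
        self.consumed = 0  # pass finished; the next pass starts from the top

    def state_dict(self):
        # Only the sampler arguments and a counter are stored, not the data.
        return {"batch_size": self.batch_size, "consumed": self.consumed}

    def load_state_dict(self, state):
        self.batch_size = state["batch_size"]
        self.consumed = state["consumed"]


if __name__ == "__main__":
    sampler = CountingSampler(list(range(7)), batch_size=2)
    it = iter(sampler)
    next(it)
    next(it)
    state = sampler.state_dict()

    restored = CountingSampler(list(range(7)), batch_size=2)
    restored.load_state_dict(state)
    print(list(restored))
```

The trade-off in this sketch is that the checkpoint stays tiny, while restoring costs a pass over the skipped items.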

Sorry for the lack of response on my side, I somehow missed this thread. And thanks again for the fix!

Would something like this work, assuming you have a VAD model in Python? Or are you looking for something different?

```python
features = cut.load_features()
mask = compute_vad_mask(vad_model, features)
features =...
```

In terms of which VAD to apply, you can use e.g. SileroVAD: https://github.com/snakers4/silero-vad/wiki/Examples-and-Dependencies#examples Actually a workflow/integration into Lhotse would be nice if somebody is willing to contribute that.

I think the simplest way to get that is to write your own dataset class like this:

```python
from lhotse.dataset.collation import collate_matrices

class EmbeddingWithVadDataset(torch.utils.data.Dataset):
    def __init__(self, ...):
        self.vad = load_vad()...
```