Piotr Żelasko
Could you take a look at the existing recipes in `lhotse/recipes` and see if you can use any of them as a basis to write your own? If you'd be...
The best option is to split and import both types of files separately.
Yes. As long as Lhotse knows the actual sampling rate, it will work. Typically you’d want to resample on the fly or before feature extraction with `cuts.resample(16000)`.
Please try #1318 and LMK if that helps.
Could you find out which files these are and share their URLs here? It would be great if we could fix the recipe properly.
I am OK with your proposed solution, could you make a PR?
Can you show an example of what's wrong with the duration, and provide your versions of lhotse and torch/torchaudio? Also can you run and show the output of: ```python import...
That last error is a mistake on my side; it should be `audio, sr = torchaudio.load(...)`. I checked locally with an m4a file: it looks like both sox and libsoundfile...
It seems it's an issue with torchaudio + m4a support, I made a PR with a workaround (https://github.com/lhotse-speech/lhotse/pull/1124), please try it out and see if it helps. BTW you might...
Thanks, it looks good! As a minor technical note, I don't think you need any of `//` (and actually even ``) for CTC models. Regarding `len(sampler)`, we did support it...