Piotr Żelasko
I merged it, please check out the latest Lhotse and try again:

```
pip uninstall lhotse
pip install git+https://github.com/lhotse-speech/lhotse
```
OK but this time it is different. You have a 5.3s cut with two supervisions - one of them spans the whole cut, the other one is much longer -...
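The situation described in this comment can be sketched in plain Python. This is not Lhotse's API: the `Supervision` dataclass, the `overhanging` helper, and the 12.0 s duration of the second supervision are hypothetical stand-ins chosen for illustration. Only the 5.3 s cut duration comes from the comment.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Supervision:
    # Hypothetical stand-in type, not the Lhotse class.
    id: str
    start: float     # seconds, relative to the start of the cut
    duration: float  # seconds

    @property
    def end(self) -> float:
        return self.start + self.duration


def overhanging(cut_duration: float, supervisions: List[Supervision], tol: float = 1e-6) -> List[str]:
    """Return ids of supervisions that start before the cut or end after it."""
    return [
        s.id
        for s in supervisions
        if s.start < -tol or s.end > cut_duration + tol
    ]


cut_duration = 5.3
sups = [
    Supervision("sup-1", start=0.0, duration=5.3),   # spans the whole cut
    Supervision("sup-2", start=0.0, duration=12.0),  # assumed value: much longer than the cut
]
print(overhanging(cut_duration, sups))
```

A check along these lines only flags the mismatch. Whether such a supervision should be trimmed, dropped, or kept is a separate decision.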
That makes sense, will do.
+1 to what both of you are saying. These scripts are also on my radar, but I'm refraining from touching them, as I'd prefer to contribute to a working...
BTW, when I find some spare time, I intend to finish that PR so that we can switch between single-GPU and multi-GPU training. I will also need to make sure...
I am pretty sure that the disambiguation symbols in the phone symbol table are also present in the acoustic model output layer, as my changes didn't address that yet. I...
Wait, does G really contain disambiguation symbols? I thought they only exist in the L alphabet (as "phones") and not in the G alphabet (as "words").
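For context on the L side of this question, here is a minimal sketch of how Kaldi-style lexicon preparation typically attaches disambiguation symbols: a pronunciation that is shared by several words, or that is a proper prefix of another pronunciation, gets `#1`, `#2`, ... appended. The function name `add_disambig_symbols` and the toy lexicon are hypothetical, and this is a simplified illustration rather than the code used in this project. As far as I know, Kaldi-style recipes additionally reserve `#0` in the word symbol table for the backoff arcs of G, which this sketch does not model.

```python
from collections import Counter
from typing import Dict, List, Tuple

Lexicon = List[Tuple[str, List[str]]]


def add_disambig_symbols(lexicon: Lexicon) -> Lexicon:
    """Append #k to pronunciations that are ambiguous (hypothetical helper)."""
    # How many words share each exact phone sequence.
    counts = Counter(tuple(phones) for _, phones in lexicon)

    # Every proper prefix of every pronunciation.
    prefixes = set()
    for _, phones in lexicon:
        for i in range(1, len(phones)):
            prefixes.add(tuple(phones[:i]))

    last_used: Dict[Tuple[str, ...], int] = {}
    result: Lexicon = []
    for word, phones in lexicon:
        key = tuple(phones)
        if counts[key] == 1 and key not in prefixes:
            # Unambiguous pronunciation: leave it untouched.
            result.append((word, list(phones)))
            continue
        # Ambiguous: use the next unused symbol for this phone sequence.
        k = last_used.get(key, 0) + 1
        last_used[key] = k
        result.append((word, list(phones) + [f"#{k}"]))
    return result


toy_lexicon = [
    ("red", ["r", "eh", "d"]),
    ("read", ["r", "eh", "d"]),  # homophone of "red"
    ("re", ["r", "eh"]),         # prefix of "r eh d"
    ("cat", ["k", "ae", "t"]),
]
for word, phones in add_disambig_symbols(toy_lexicon):
    print(word, " ".join(phones))
```

In this sketch the `#k` symbols appear only on the phone side, which is why they end up in the phone symbol table.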
Thanks! Never too late to learn.
I was testing a bunch of speech synthesis and vocoder models, and found the following operators missing so far:

- `aten::flip`
- `aten::equal`
- `aten::upsample_nearest1d.out`
Good insight! I was able to validate that you're right by replacing noise generation like this:

```python
INT16MAX = 32768
noise = torch.randint(-INT16MAX, INT16MAX - 1, (1, 32000))
noise =...
```