Piotr Żelasko
I'm going to use @danpovey's solution rather than @csukuangfj's solution -- unfortunately, it is not straightforward to estimate how many utterances should be dropped in `partition_cut_ids`, since we have a...
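For illustration, the simple by-count version would look something like the sketch below (a hypothetical helper, not Lhotse's actual `partition_cut_ids`): each worker keeps every `world_size`-th cut ID after the remainder is dropped, so all workers end up with equal-sized partitions.

```python
from typing import List


def partition_evenly(cut_ids: List[str], world_size: int, rank: int) -> List[str]:
    """Hypothetical sketch: split cut IDs evenly across distributed workers,
    dropping the trailing remainder so every worker gets the same count."""
    num_kept = (len(cut_ids) // world_size) * world_size
    return cut_ids[:num_kept][rank::world_size]
```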
@danpovey @csukuangfj can you please try out the version in PR https://github.com/lhotse-speech/lhotse/pull/267 and let me know if it helped? I won't be able to test the snowfall distributed training setup...
Merged!
FYI this could be of interest to us https://huggingface.co/blog/accelerate-library
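The gist of the API from that post is roughly the following sketch (`compute_loss` and the other arguments are placeholders for our own setup code):

```python
from accelerate import Accelerator


def train(model, optimizer, train_loader, compute_loss):
    """Sketch of the Accelerate pattern: device placement and
    distributed wrapping happen inside `prepare`."""
    accelerator = Accelerator()
    model, optimizer, train_loader = accelerator.prepare(
        model, optimizer, train_loader
    )
    for batch in train_loader:
        loss = compute_loss(model, batch)
        optimizer.zero_grad()
        accelerator.backward(loss)  # replaces loss.backward()
        optimizer.step()
```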
> @csukuangfj I'll talk to Kangwei about doing this, it would be a good first project that can let him understand the basics of k2 C++ programming.

Cool! In that...
Could be a good opportunity to get more familiar with k2's C++ code. I'll start with the Python part and we'll see from there.
@jtrmal I won't find the time to work on it this week -- if you want to, feel free to start (just let me know if you do).
Nice! I think there are two more easy wins: using tglarge for decoding (I think we’re using tgmed currently) and saving checkpoints more often than once per epoch so we can...
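Roughly, the checkpointing part would look something like this sketch of a plain PyTorch loop (the interval, paths, and helper names are illustrative):

```python
import torch


def train_one_epoch(model, optimizer, train_loader, compute_loss, epoch,
                    save_every_n_batches=1000, exp_dir="exp"):
    """Sketch: checkpoint mid-epoch so a crashed run doesn't lose a full epoch."""
    for batch_idx, batch in enumerate(train_loader):
        loss = compute_loss(model, batch)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        # Save more often than once per epoch (interval is illustrative).
        if (batch_idx + 1) % save_every_n_batches == 0:
            torch.save(
                {
                    "epoch": epoch,
                    "batch": batch_idx + 1,
                    "model": model.state_dict(),
                    "optimizer": optimizer.state_dict(),
                },
                f"{exp_dir}/epoch-{epoch}-batch-{batch_idx + 1}.pt",
            )
```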
The data augmentation setup probably needs some tuning. I ran the full libri recipe as-is, and got:

```
Epoch 3:
2021-04-07 11:42:46,500 INFO [common.py:357] [test-clean] %WER 5.53% [2909 / 52576,...
```
I think we already have "both"-sides padding implemented, which would center the cuts. It'd look something like this (except the cuts would be concatenated first, with data augmentation applied):...
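For reference, a minimal sketch of both-sides padding, assuming a Lhotse version where `CutSet.pad` accepts `direction='both'` (the manifest path is illustrative):

```python
from lhotse import load_manifest

# Illustrative path; in practice this would be one of our manifests.
cuts = load_manifest("data/cuts_train.jsonl.gz")

# Pad every cut to a fixed duration, adding padding on both sides,
# which centers the original cut within the padded one.
padded = cuts.pad(duration=30.0, direction="both")
```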