Piotr Żelasko comments

Results: 523 comments of Piotr Żelasko

The processing efficiency and sampling rate problem of OPUS files

Hmm, I remember disabling it because I found the reverse to be true on some systems. I think the best way forward would be to expose the control over this...

The processing efficiency and sampling rate problem of OPUS files

Regarding 48kHz vs 16kHz, I'm not sure I got your point. OPUS is always decoded to 48kHz even if the original audio had a lower sampling rate, unless I missed something.

The processing efficiency and sampling rate problem of OPUS files

If the file has 16kHz, that makes sense. I just never encountered an OPUS file that actually has a sampling rate other than 48kHz, even when I encoded WAV data...

Wasting the memory of the train data

I assume you’re training ASR. You can optimize the use of GPU memory by using DynamicBucketingSampler and tweaking `max_duration` and `quadratic_duration` first, and to some extent also `num_buckets`.
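To illustrate why `max_duration` and `quadratic_duration` interact, here is a minimal pure-Python sketch, not Lhotse's actual implementation. It assumes that the quadratic penalty turns a cut's duration `d` into an effective duration `d + d**2 / quadratic_duration`, and that a batch is filled until the summed effective duration would exceed `max_duration`. The helper names `effective_duration` and `greedy_batches` are hypothetical.

```python
from typing import List, Optional


def effective_duration(duration: float, quadratic_duration: Optional[float] = None) -> float:
    """Duration used for batch budgeting (assumed formula, see lead-in)."""
    if quadratic_duration is None:
        return duration
    # Longer cuts get a penalty that grows quadratically with their length.
    return duration + duration ** 2 / quadratic_duration


def greedy_batches(
    durations: List[float],
    max_duration: float,
    quadratic_duration: Optional[float] = None,
) -> List[List[float]]:
    """Pack cuts in order; start a new batch when the budget would be exceeded."""
    batches: List[List[float]] = []
    current: List[float] = []
    total = 0.0
    for d in durations:
        cost = effective_duration(d, quadratic_duration)
        if current and total + cost > max_duration:
            batches.append(current)
            current, total = [], 0.0
        current.append(d)
        total += cost
    if current:
        batches.append(current)
    return batches


# Fifty 30-second cuts, standing in for a bucket of long utterances.
long_cuts = [30.0] * 50

plain = greedy_batches(long_cuts, max_duration=300.0)
penalized = greedy_batches(long_cuts, max_duration=300.0, quadratic_duration=30.0)

print(len(plain[0]), len(penalized[0]))
```

Under these assumptions, the penalty shrinks batches of long cuts while leaving batches of short cuts almost unchanged, which is what evens out GPU memory use for models whose cost grows quadratically with input length.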

Audio range out of (-1,+1)

Hi Karel! We had another issue related to this somewhere. Technically we could either add conditional rescaling (if `np.max(np.abs(audio)) > 1.0`, then divide the audio by its max-abs value) or a limiter...
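The conditional rescaling option could look roughly like the sketch below. It assumes the audio is a floating-point NumPy array. The function name `rescale_if_out_of_range` is hypothetical and not part of any library. The limiter alternative is not shown.

```python
import numpy as np


def rescale_if_out_of_range(audio: np.ndarray) -> np.ndarray:
    """Divide by the peak absolute value only when samples exceed [-1, 1]."""
    if audio.size == 0:
        return audio
    max_abs = np.max(np.abs(audio))
    if max_abs > 1.0:
        # Peak-normalize so the loudest sample lands exactly on +/-1.
        return audio / max_abs
    # Already in range: leave the signal untouched.
    return audio


too_loud = np.array([0.5, -2.0, 1.0])
in_range = np.array([0.25, -0.5, 0.9])

print(rescale_if_out_of_range(too_loud))
print(rescale_if_out_of_range(in_range))
```

Note that this changes the gain of only the affected recordings, so relative loudness across a dataset is not preserved.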

Added handlings for negative end time (#1203)

@desh2608 Are we not adjusting alignment times together with supervision times when creating a cut? @ArthLeu could you add a unit test that would have failed before but will pass...

Updating lhotse caused some errors when reading data

You’re trying to read precomputed features from cuts that don’t have them. Are you sure you got the right cut set?

how much shared memory and disk memory do i need to process the S subset of wenetspeech dataset?

I’m guessing this is related to IPC between the data loading workers used for batch feature computation, possibly caused by too many workers or too-large batches; but judging by the warning...

how much shared memory and disk memory do i need to process the S subset of wenetspeech dataset?

Try `cuts = cuts.trim_to_supervisions()` before feature extraction and then you can also use multiple workers again.

how much shared memory and disk memory do i need to process the S subset of wenetspeech dataset?

Yeah