Samuele Cornell comments

Results: 70 comments of Samuele Cornell

Test baseline on audio stream

Not really, but in https://github.com/DCASE-REPO/DESED_task/blob/c6bcb45b8b986ccde5c56bb86eefaf9d19b2320c/recipes/dcase2024_task4_baseline/local/sed_trainer_pretrained.py#L1457 we reconstruct long-form predictions from windowed predictions
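For readers unfamiliar with the idea, here is a minimal sketch of stitching windowed frame-level predictions back into one long-form sequence. It is not the code at the linked line: the function name `stitch_windows`, the fixed hop, and the choice to average overlapping frames are all assumptions made for illustration.

```python
from typing import List, Sequence


def stitch_windows(
    window_preds: Sequence[Sequence[float]],
    hop: int,
    total_len: int,
) -> List[float]:
    """Average overlapping per-frame scores into one long sequence.

    Hypothetical helper: window_preds[i] is assumed to hold the frame
    scores of the window that starts at frame i * hop.
    """
    sums = [0.0] * total_len
    counts = [0] * total_len
    for i, preds in enumerate(window_preds):
        start = i * hop
        for j, p in enumerate(preds):
            t = start + j
            if t >= total_len:
                break  # ignore padded frames past the end of the recording
            sums[t] += p
            counts[t] += 1
    # Frames covered by no window keep a score of 0.0.
    return [s / c if c else 0.0 for s, c in zip(sums, counts)]


# Three windows of 4 frames each, hop of 2 frames, 7-frame recording.
windows = [
    [1.0, 1.0, 0.0, 0.0],
    [0.0, 1.0, 1.0, 1.0],
    [1.0, 0.0, 0.0, 0.0],
]
stitched = stitch_windows(windows, hop=2, total_len=7)
```

Averaging is only one way to resolve overlaps; taking the maximum or keeping only the centre of each window are equally plausible choices.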

Test baseline on audio stream

What do you mean by audio cues?

Test baseline on audio stream

Then yeah

Added pseudo alignment strategy based on phoneme duration

> For Japanese often the character error rate is reported, but I don't know if usually the tools are language aware or people prepare the transcript to use WER calculators....
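As a rough illustration of the metric mentioned in the quote, character error rate can be computed as a character-level edit distance divided by the reference length. The sketch below is an assumption-laden example, not the behaviour of any particular scoring tool: the names `edit_distance` and `cer` are hypothetical, and dropping whitespace before scoring is one possible normalization.

```python
from typing import Sequence


def edit_distance(ref: Sequence[str], hyp: Sequence[str]) -> int:
    """Levenshtein distance between two token sequences."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, 1):
        curr = [i]
        for j, h in enumerate(hyp, 1):
            curr.append(
                min(
                    prev[j] + 1,             # deletion
                    curr[j - 1] + 1,         # insertion
                    prev[j - 1] + (r != h),  # substitution or match
                )
            )
        prev = curr
    return prev[-1]


def cer(reference: str, hypothesis: str) -> float:
    """Character error rate, ignoring whitespace (an assumption)."""
    ref_chars = [c for c in reference if not c.isspace()]
    hyp_chars = [c for c in hypothesis if not c.isspace()]
    if not ref_chars:
        raise ValueError("reference must contain at least one character")
    return edit_distance(ref_chars, hyp_chars) / len(ref_chars)


# One substitution and one deletion against a 7-character reference.
score = cer("今日は晴れです", "今日は雨です")
```

Splitting the text into characters before scoring is also what makes a word-level calculator return a character-level figure, which is the transcript preparation the quote alludes to.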

Added pseudo alignment strategy based on phoneme duration

Hey guys, yeah, I plan to adjust it, but I am currently busy at JSALT; I thought I would have more time. I can do it this weekend though.

Added pseudo alignment strategy based on phoneme duration

Hey Thilo, I'm currently still busy with ICLR...

'synth_set' used twice

Hi, thanks for the question. I think it was done only to "upsample" the amount of synthetic training data during each epoch. It is very similar to having 12...

'synth_set' used twice

After many tries, it seems to me that the best configuration is this one, with the strong and synth sets concatenated. The strong labels do not seem to help in my...

[espnet3-9] Add Librispeech-100h ASR recipe

> @popcornell
> Could you please review this PR, particularly the template-related parts in decode/asr.py and run.py?
>
> I was considering two possible approaches for handling this:
>
> ...

[espnet3-9] Add Librispeech-100h ASR recipe

Yeah, I think that will be the cleanest: to have templates for all basic tasks.