Samuele Cornell
Not really, but in https://github.com/DCASE-REPO/DESED_task/blob/c6bcb45b8b986ccde5c56bb86eefaf9d19b2320c/recipes/dcase2024_task4_baseline/local/sed_trainer_pretrained.py#L1457 we reconstruct long-form predictions from windowed predictions
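A minimal sketch of the general idea, not the code at the linked line: it assumes frame-level predictions from fixed-length windows taken at a constant hop, and it averages the frames where windows overlap. The name `stitch_windows` and its parameters are hypothetical and chosen only for illustration.

```python
def stitch_windows(window_preds, hop, total_frames):
    """Average overlapping windowed predictions into one long-form prediction.

    window_preds: list of windows, each a list of frames, each frame a list
                  of per-class scores (hypothetical layout).
    hop:          shift between consecutive windows, in frames.
    total_frames: number of frames in the long-form output.
    """
    n_classes = len(window_preds[0][0])
    acc = [[0.0] * n_classes for _ in range(total_frames)]
    count = [0] * total_frames

    for i, window in enumerate(window_preds):
        start = i * hop
        for j, frame in enumerate(window):
            t = start + j
            if t >= total_frames:
                break  # drop frames that fall past the end of the recording
            for c, score in enumerate(frame):
                acc[t][c] += score
            count[t] += 1

    # Frames covered by no window stay at zero (the divisor is clamped to 1).
    return [[v / max(n, 1) for v in row] for row, n in zip(acc, count)]


# Two 4-frame windows with a hop of 2 frames, single class.
windows = [[[0.0]] * 4, [[2.0]] * 4]
long_form = stitch_windows(windows, hop=2, total_frames=6)
```

Plain averaging is only one choice; a weighted overlap-add or taking the maximum over overlapping frames would fit the same structure.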
What do you mean by audio cues?
Then yeah
> For Japanese often the character error rate is reported, but I don't know if usually the tools are language aware or people prepare the transcript to use WER calculators....
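To make the CER/WER distinction concrete, here is a small sketch that assumes a plain Levenshtein distance over tokens and no language-aware segmentation. The helper names (`edit_distance`, `cer`, `wer`) are hypothetical and do not correspond to any specific scoring tool.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two token sequences."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, 1):
        curr = [i]
        for j, h in enumerate(hyp, 1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def error_rate(ref_tokens, hyp_tokens):
    return edit_distance(ref_tokens, hyp_tokens) / max(len(ref_tokens), 1)


def cer(ref, hyp):
    # Character error rate: every non-space character is a token.
    return error_rate(list(ref.replace(" ", "")), list(hyp.replace(" ", "")))


def wer(ref, hyp):
    # Word error rate: tokens come from whitespace splitting only.
    return error_rate(ref.split(), hyp.split())


ref, hyp = "今日は晴れ", "今日は雨"
print(cer(ref, hyp))  # 2 character edits over 5 reference characters
print(wer(ref, hyp))  # the whole unsegmented sentence counts as one word
```

Without a segmentation step, a whitespace-based WER treats an unsegmented Japanese sentence as a single word, which is one reason CER is the more natural metric there.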
Hey guys, yeah, I plan to adjust it. But I am currently busy at JSALT; I thought I would have more time. I can do it this weekend though.
Hey Thilo, currently still busy for ICLR...
Hi, thanks for the question. I think it was done only to "upsample" the amount of synthetic training data in each epoch. It is very similar to having 12...
After many tries, it seems to me that the best configuration is this one, with the strong and synth concatenated. The strong labels do not seem to help in my...
> @popcornell > Could you please review this PR, particularly the template-related parts in decode/asr.py and run.py? > > I was considering two possible approaches for handling this: > >...
Yeah, I think that will be the cleanest: to have templates for all basic tasks.