Jonatan Kłosko

Results 340 comments of Jonatan Kłosko

Hm yeah that wouldn't work. We could keep track of user sessions, but I think this is too brittle. Generally it's best if each scoretaker has an account. They can...

There is a link to the competitor's WCA profile and it should be enough at this point. In the long term we could figure out some UI displaying current personal...

We could do more checks, but we need to find a balance. We definitely shouldn't introduce an additional back-and-forth to the server for this. Verifying against current regional records is...

I think we shouldn't show results on the competitors page, because then we effectively need to show all results from the competition. But I'm open to improving the page to...

@ArthurZucker thanks for the help! I think now the steps are to update the unknown token in the multilingual checkpoints and add `tokenizer.json` to the repos. Let me know if there's...

@ArthurZucker sure! I've just created https://huggingface.co/openai/whisper-tiny/discussions/5, let me know if it looks as expected and I will open a matching PR on the other checkpoints too. FTR I generated the...

Changing the unknown token in the configuration leads to weird behaviour when loading the slow tokenizer; see the example in the PR. Any ideas why that is?

So the issue is that the multilingual tokenizer doesn't have `` in the initial vocabulary, so it would need to be added from the special tokens map. However, when loading special...

To address this we would need to add `"": 50257` to `vocab.json` and remove it from `added_tokens.json`. Note that this is the case in the English checkpoints (except with 50256)....
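For illustration, here is a minimal sketch of that kind of edit, assuming `vocab.json` and `added_tokens.json` are both flat JSON objects mapping token strings to ids. The helper names `move_token` and `move_token_in_repo` are hypothetical, and the token and id are left as parameters rather than hard-coded, so this is not the exact change made to the checkpoints.

```python
import json
from pathlib import Path


def move_token(vocab, added, token, token_id):
    """Return copies of both maps with `token` moved into the base vocab.

    `vocab` and `added` are assumed to be dicts of token string -> id,
    i.e. the parsed contents of vocab.json and added_tokens.json.
    """
    # Refuse to reuse an id that a different vocab entry already holds.
    clash = [t for t, i in vocab.items() if i == token_id and t != token]
    if clash:
        raise ValueError(f"id {token_id} is already used by {clash[0]!r}")

    new_vocab = dict(vocab)
    new_vocab[token] = token_id

    # Drop the token from the added tokens so it is not defined twice.
    new_added = {t: i for t, i in added.items() if t != token}
    return new_vocab, new_added


def move_token_in_repo(repo_dir, token, token_id):
    """Apply `move_token` to the two JSON files in a local checkpoint dir."""
    repo = Path(repo_dir)
    vocab_path = repo / "vocab.json"
    added_path = repo / "added_tokens.json"

    vocab = json.loads(vocab_path.read_text(encoding="utf-8"))
    added = json.loads(added_path.read_text(encoding="utf-8"))

    new_vocab, new_added = move_token(vocab, added, token, token_id)

    # ensure_ascii=False keeps non-ASCII token strings readable on disk.
    vocab_path.write_text(json.dumps(new_vocab, ensure_ascii=False), encoding="utf-8")
    added_path.write_text(json.dumps(new_added, ensure_ascii=False), encoding="utf-8")
```

The id clash check is there because silently overwriting an existing id would shift how other tokens decode, which is the kind of breakage this change is meant to avoid.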

Ah, so we should actually replace it, so that `` gets the id that currently `""` has, and we keep `""` just to make sure the ids are not shifted...