Alex McKinney
Thanks Tianyu, that is quite helpful. Do you also have any similar results for the RoBERTa experiments? I currently only have test scripts for this model (working on OPT, but...
Yes, to be precise: - I click "Compute Masks" - A segmentation model is run and masks stored as a hidden state - When changing the sliders or checkboxes the...
That's a shame, is there any way I could hack this in outside the intended API?
I would work from `main`. Things got very busy (has it really nearly been two years since this issue?) and I don't see myself returning to this repo in the...
Hi, has there been any update on this? It appears the above PR in :hugs: Transformers has been merged. It would be really useful in my own application for sub-second...
Ah thanks, it makes sense to filter here, but why did the code work when not using Multisteps? Afaik Multisteps is also a gradient transformation, so the API should...
Hi, I also didn't have the resources to train the PixelSnail component at the time, so no file exists. I am actively working on a refactor of the repository with...
Looking at the paper, the distilled version was only trained on English data. I am interested in evaluating the model on Mandarin Chinese data once it is released, to see...
Fantastic~ Should we expect the speedup to be less for non-English audio on the English distilled model? Not familiar with the ins and outs of speculative decoding.