OFA
Performance of `SequenceGenerator` degrades considerably for larger HF-compatible models
The `SequenceGenerator` ported from Fairseq to the `add-transformers` branch works well with the tiny variant, but its performance degrades considerably for larger models.
I tried the newly added Colab notebook with the other variants. The huge and base variants generate only "a" (yes, a single letter), while the large model outputs only the single word "animal."
`model.generate` and the original Fairseq decoder with a Fairseq-compatible checkpoint still produce OK-ish outputs.
I also ran into the same issue with the checkpoint I shared in #171. The huge variant consistently outputs "a" with the ported `SequenceGenerator`, while `model.generate` does a better job. It works well with the tiny variant, though.
Thanks for your report! There might be a problem inside the generator; let us check it further. Do you mean that the HF native generator works well with all variants, or did I misunderstand something?
Thanks @JustinLin610!
Yes, let me clarify it:
- The HF native generator works well with all the HF-compatible variants.
- The original Fairseq decoder works well with all the Fairseq-compatible variants.
- The `SequenceGenerator` under `transformers.models.OFA.generate`, ported from the original Fairseq decoder, works well with only the tiny variant. It generates simply "a" with the huge variant and single-word outputs with the large variant, for example.
Got it! We'll check it out and get back to you as soon as possible.
Hi, I think it is caused by the recent PRs for inverting the masks. I just tried your case, and it really does output "a" (lol...). After my update (which also inverts the mask in the Fairseq generator), the result should be "one of the animals we saw".
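For anyone hitting the same symptom, the convention mismatch behind this kind of bug can be sketched as follows. This is a minimal illustration, not OFA's actual code: it assumes HF-style attention masks use 1 = attend / 0 = pad, while Fairseq-style key padding masks use True = pad / False = attend, and the helper name is hypothetical.

```python
def hf_to_fairseq_padding_mask(attention_mask):
    """Hypothetical helper: convert an HF-style attention mask
    (1 = attend, 0 = pad) into a Fairseq-style key padding mask
    (True = pad, False = attend).

    If a generator ported from Fairseq is fed an un-inverted HF mask,
    it effectively attends to padding and masks out the real tokens,
    which can collapse the output to a single token such as "a".
    """
    return [[value == 0 for value in row] for row in attention_mask]


# HF convention: the last two positions of this sequence are padding.
hf_mask = [[1, 1, 1, 0, 0]]
print(hf_to_fairseq_padding_mask(hf_mask))
# → [[False, False, False, True, True]]
```

The fix described above amounts to applying exactly this kind of inversion inside the ported generator so both code paths agree on which positions are padding.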