OFA
Performance of `SequenceGenerator` degrades considerably for larger HF-compatible models
The `SequenceGenerator` ported from Fairseq to the `add-transformers` branch works well with the tiny variant, but its performance degrades considerably for larger models.
I tried the newly added Colab notebook with the other variants. The huge and base variants generate only "a" (yes, a single letter), while the large model outputs only the single word "animal."
`model.generate` and the original Fairseq decoder with a Fairseq-compatible checkpoint still produce OK-ish outputs.
I also ran into the same issue with the checkpoint I shared in #171. The huge variant consistently outputs "a" with the ported `SequenceGenerator`, while `model.generate` does a better job. It works well with the tiny variant, though.
Thanks for your report! There might be a problem inside the generator; let us check it further. Do you mean that the HF native generator works well with all variants, or did I misunderstand something?
Thanks @JustinLin610!
Yes, let me clarify it:
- The HF native generator works well with all the HF-compatible variants.
- The original Fairseq decoder works well with all the Fairseq-compatible variants.
- The `SequenceGenerator` under `transformers.models.OFA.generate`, ported from the original Fairseq decoder, works well with only the tiny variant. It generates simply "a" with the huge variant and single-word outputs with the large variant, for example.
Got it! We'll check it out and get back to you as soon as possible.
Hi, I think it is caused by the recent PRs for inverting the masks. I just tried your case, and it really does output "a" (lol...). After my update (which also inverts the mask in the Fairseq generator), the result should be "one of the animals we saw".
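For anyone hitting the same symptom, the convention mismatch behind this kind of bug can be sketched as follows. This is a minimal illustration, not OFA's actual code: it assumes HF-style attention masks use 1 = attend / 0 = pad, while Fairseq-style key padding masks use True = pad / False = attend, and the helper name is hypothetical.

```python
def hf_to_fairseq_padding_mask(attention_mask):
    """Hypothetical helper: convert an HF-style attention mask
    (1 = attend, 0 = pad) into a Fairseq-style key padding mask
    (True = pad, False = attend).

    If a generator ported from Fairseq is fed an un-inverted HF mask,
    it effectively attends to padding and masks out the real tokens,
    which can collapse the output to a single token such as "a".
    """
    return [[value == 0 for value in row] for row in attention_mask]


# HF convention: the last two positions of this sequence are padding.
hf_mask = [[1, 1, 1, 0, 0]]
print(hf_to_fairseq_padding_mask(hf_mask))
# → [[False, False, False, True, True]]
```

The fix described above amounts to applying exactly this kind of inversion inside the ported generator so both code paths agree on which positions are padding.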