optimum
longT5 BetterTransformer implementation
Feature request
longT5 BetterTransformer implementation
Motivation
Encoder-decoder models trained on large contexts enable machine translation tasks.
Your contribution
I looked at the implementation of regular T5 and it doesn't look too complex. I tried to implement it myself but didn't succeed. If I can contribute, please let me know.
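For reference, this is roughly how BetterTransformer is applied to a model that is already supported (regular T5 here); the goal of this request would be for the same call to work with a LongT5 checkpoint, which currently is not in the list of supported architectures:

```python
# Minimal usage sketch: converting a supported seq2seq model with BetterTransformer.
# LongT5 checkpoints are not covered yet, so the same transform would fail for them.
from transformers import AutoModelForSeq2SeqLM
from optimum.bettertransformer import BetterTransformer

model = AutoModelForSeq2SeqLM.from_pretrained("t5-small")

# Swaps supported attention/encoder layers for their BetterTransformer equivalents.
model = BetterTransformer.transform(model)
```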
Thank you, Omri
Seconding this! Would be great.
Totally on board with this! Would love to see this feature added!
@fxmarty can we try to tackle this together?
Thanks in advance
Hi, for reference we are upstreaming SDPA in Transformers; it may be a better fit for longT5: https://github.com/huggingface/transformers/issues/28005
Leaving this open as we may leverage nested tensors for longt5 (which are not in Transformers).
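For illustration, once an SDPA implementation lands for the architecture, it would typically be enabled directly at load time in Transformers rather than through BetterTransformer. The checkpoint name below is just an example, and passing `attn_implementation="sdpa"` only works for architectures that already ship an SDPA attention class:

```python
# Hypothetical usage once LongT5 gains SDPA support in Transformers;
# today this raises an error for architectures without an SDPA implementation.
from transformers import AutoModelForSeq2SeqLM

model = AutoModelForSeq2SeqLM.from_pretrained(
    "google/long-t5-tglobal-base",
    attn_implementation="sdpa",
)
```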
Hi @All. Is this still open, or will you be working on it?