[feat] Add support for encoder-only transformers (e.g. BERT)

Open: OxxoCodes opened this issue 1 year ago • 0 comments

🚀 The feature, motivation and pitch

Liger Kernel is currently incompatible with encoder-only transformer architectures such as BERT, DistilBERT, RoBERTa, XLM-R, and DeBERTa.

These models remain important in research and industry use cases, so it would be great to see support added to reduce their memory requirements and increase training throughput.

Alternatives

No response

Additional context

No response

OxxoCodes, Aug 27 '24 23:08