
VisionEncoderDecoderModel gradient checkpointing

Open metemadi opened this issue 1 year ago • 2 comments

Feature request

I would love to be able to use gradient checkpointing with VisionEncoderDecoderModel. Enabling it currently raises an error:

model.gradient_checkpointing_enable()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/opt/conda/lib/python3.8/site-packages/transformers/modeling_utils.py", line 1418, in gradient_checkpointing_enable
    raise ValueError(f"{self.__class__.__name__} does not support gradient checkpointing.")
ValueError: VisionEncoderDecoderModel does not support gradient checkpointing.
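For readers unfamiliar with where this error comes from, here is a minimal sketch of the kind of opt-in check the traceback points at. The `Toy*` classes are hypothetical stand-ins, not transformers code, and the class-level flag name `supports_gradient_checkpointing` is an assumption about how a model class opts in; only the error message is taken from the traceback above.

```python
class ToyPreTrainedModel:
    """Hypothetical, simplified base class (not the transformers implementation)."""

    # Assumed opt-in flag: subclasses set this to True once their forward
    # pass knows how to run with checkpointing.
    supports_gradient_checkpointing = False

    def __init__(self):
        self.gradient_checkpointing = False

    def gradient_checkpointing_enable(self):
        # Same message shape as the ValueError shown in the traceback.
        if not self.supports_gradient_checkpointing:
            raise ValueError(
                f"{self.__class__.__name__} does not support gradient checkpointing."
            )
        self.gradient_checkpointing = True


class ToyUnsupportedModel(ToyPreTrainedModel):
    """Inherits the default flag, so enabling checkpointing fails."""


class ToySupportedModel(ToyPreTrainedModel):
    """Opts in by overriding the class-level flag."""

    supports_gradient_checkpointing = True


try:
    ToyUnsupportedModel().gradient_checkpointing_enable()
except ValueError as err:
    message = str(err)

supported = ToySupportedModel()
supported.gradient_checkpointing_enable()
```

Under that assumption, adding support to a model is mostly a matter of flipping the flag and making the forward pass honor `gradient_checkpointing`.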

Motivation

Gradient checkpointing lowers training memory by recomputing activations during the backward pass, which makes larger models more accessible on limited hardware. HuggingFace is awesome!

Your contribution

Happy to take a stab at this if someone can point me to a previous example of this working with an EncoderDecoder model.

metemadi, Aug 07 '22 17:08

@NielsRogge, have you seen such examples? :)

LysandreJik, Aug 09 '22 08:08

Here's a PR that added gradient checkpointing to T5: https://github.com/huggingface/transformers/pull/11353/files
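As a rough illustration of the general pattern (not the code from that PR), a layer stack can wrap each layer call in `torch.utils.checkpoint.checkpoint` when a flag is set and the module is in training mode. `TinyStack`, its sizes, and the `grads_with` helper are hypothetical names made up for this sketch.

```python
import torch
from torch import nn
from torch.utils.checkpoint import checkpoint


class TinyStack(nn.Module):
    """Hypothetical layer stack used only for illustration."""

    def __init__(self, hidden=16, num_layers=3):
        super().__init__()
        self.layers = nn.ModuleList(
            [nn.Sequential(nn.Linear(hidden, hidden), nn.GELU()) for _ in range(num_layers)]
        )
        self.gradient_checkpointing = False

    def forward(self, hidden_states):
        for layer in self.layers:
            if self.gradient_checkpointing and self.training:
                # Activations inside `layer` are not stored; they are
                # recomputed when backward reaches this segment.
                hidden_states = checkpoint(layer, hidden_states, use_reentrant=False)
            else:
                hidden_states = layer(hidden_states)
        return hidden_states


def grads_with(checkpointing):
    """Run one forward/backward pass and return the parameter gradients."""
    torch.manual_seed(0)  # same weights and inputs for both runs
    model = TinyStack()
    model.train()
    model.gradient_checkpointing = checkpointing
    inputs = torch.randn(4, 16)
    model(inputs).sum().backward()
    return [p.grad.clone() for p in model.parameters()]


plain = grads_with(False)
checkpointed = grads_with(True)
grads_match = all(torch.allclose(a, b) for a, b in zip(plain, checkpointed))
```

Checkpointing should change only memory use and compute time, not the gradients, so comparing the two runs is a quick sanity check when wiring this into a new model.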

NielsRogge, Aug 09 '22 11:08

Fixed per #18697

NielsRogge, Aug 26 '22 12:08