VisionEncoderDecoderModel gradient checkpointing
Feature request
I would love to be able to use gradient checkpointing with VisionEncoderDecoderModel. Enabling it currently fails:
model.gradient_checkpointing_enable()
Traceback (most recent call last):
  File "", line 1, in <module>
  File "/opt/conda/lib/python3.8/site-packages/transformers/modeling_utils.py", line 1418, in gradient_checkpointing_enable
    raise ValueError(f"{self.__class__.__name__} does not support gradient checkpointing.")
ValueError: VisionEncoderDecoderModel does not support gradient checkpointing.
Motivation
Gradient checkpointing trades some extra compute for lower activation memory, which makes larger models trainable on smaller GPUs - HuggingFace is awesome!!!
Your contribution
Happy to take a stab at this if someone can point me to a previous example of this working with an EncoderDecoder model.
@NielsRogge, have you seen such examples? :)
Here's a PR that added gradient checkpointing to T5: https://github.com/huggingface/transformers/pull/11353/files
Fixed per #18697
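For anyone landing here from the ValueError in the original report, here is a toy sketch of the two ideas involved: a class-level opt-in guard, and a composite wrapper that forwards the setting to its encoder and decoder. This is plain Python, not the transformers source, and it does not reproduce the actual change in #18697. `ToySubModel`, `ToyEncoderDecoder`, `ToyUnsupportedWrapper`, and `set_gradient_checkpointing` are hypothetical names made up for illustration; only the `supports_gradient_checkpointing` flag name and the error message are meant to mirror the library, as far as I understand it.

```python
class ToySubModel:
    """Hypothetical stand-in for an encoder or decoder that can checkpoint its layers."""

    def __init__(self, name):
        self.name = name
        self.gradient_checkpointing = False

    def set_gradient_checkpointing(self, value):
        # A real model would switch its layers to recompute activations
        # during the backward pass; here we only record the setting.
        self.gradient_checkpointing = value


class ToyEncoderDecoder:
    """Hypothetical composite wrapper with no layers of its own."""

    # Class-level opt-in flag (assumed to mirror the library's attribute name).
    supports_gradient_checkpointing = True

    def __init__(self, encoder, decoder):
        self.encoder = encoder
        self.decoder = decoder

    def gradient_checkpointing_enable(self):
        # Guard: classes that have not opted in raise the error from the report.
        if not self.supports_gradient_checkpointing:
            raise ValueError(
                f"{self.__class__.__name__} does not support gradient checkpointing."
            )
        # The wrapper owns no layers, so it forwards the setting to both halves.
        for sub_model in (self.encoder, self.decoder):
            sub_model.set_gradient_checkpointing(True)


class ToyUnsupportedWrapper(ToyEncoderDecoder):
    """Same wrapper without the opt-in, to reproduce the failure mode."""

    supports_gradient_checkpointing = False


def try_enable(model):
    """Return None on success, or the error message if enabling is refused."""
    try:
        model.gradient_checkpointing_enable()
    except ValueError as err:
        return str(err)
    return None


supported = ToyEncoderDecoder(ToySubModel("encoder"), ToySubModel("decoder"))
unsupported = ToyUnsupportedWrapper(ToySubModel("encoder"), ToySubModel("decoder"))

supported_result = try_enable(supported)
unsupported_result = try_enable(unsupported)
```

In this sketch the wrapper only needs to opt in and delegate, since the memory savings come from the encoder and decoder layers themselves. Whether a real encoder or decoder supports checkpointing still depends on the underlying model class.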