transformers Support gradient checkpointing for ESM models

Support gradient checkpointing for ESM models

mahdip72 opened this issue 2 years ago

Would you please add gradient_checkpointing_enable() support for ESM models? They are currently the best available pre-trained protein language models for researchers. Many thanks.

Jun 30 '23 18:06 mahdip72

cc @Rocketknight1

Jun 30 '23 18:06 amyeroberts

Any updates?

Aug 30 '23 03:08 mahdip72

It's on the to-do list, but I'm afraid there are competing priorities at the moment!

Aug 30 '23 12:08 Rocketknight1

Let's open it up for anyone in the community who might want to tackle it :)

Aug 30 '23 12:08 amyeroberts

Hi @amyeroberts @Rocketknight1 I would like to work on this

Sep 05 '23 00:09 sanjeevk-os

@sanjeevk-os Great! Once you have the code ready, open a PR and ping both @Rocketknight1 and me. Looking forward to reviewing!

Sep 05 '23 11:09 amyeroberts

Hi @sanjeevk-os, I took a look at the ESM code - it looks like some of the support for gradient checkpointing is already there, in which case you just need to make a one-line change to set supports_gradient_checkpointing = True

Sep 07 '23 11:09 Rocketknight1
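
As a reading aid, the flag mentioned above acts as a class-level opt-in: in transformers, calling gradient_checkpointing_enable() on a model whose class leaves supports_gradient_checkpointing at False raises an error. The toy classes below are hypothetical stand-ins written for illustration, not the library's code; they only mimic that gating pattern, assuming the base class checks the flag before enabling anything.

```python
class ToyPreTrainedModel:
    """Hypothetical stand-in for a base model class (not the transformers code)."""

    # Subclasses opt in by overriding this class attribute.
    supports_gradient_checkpointing = False

    def __init__(self):
        self.gradient_checkpointing = False

    def gradient_checkpointing_enable(self):
        # Refuse unless the subclass has declared support.
        if not self.supports_gradient_checkpointing:
            raise ValueError(
                f"{type(self).__name__} does not support gradient checkpointing."
            )
        self.gradient_checkpointing = True


class ToyEsmBefore(ToyPreTrainedModel):
    """Flag left at the default, so enabling checkpointing fails."""


class ToyEsmAfter(ToyPreTrainedModel):
    # The kind of one-line change described in the comment above.
    supports_gradient_checkpointing = True
```

With these toy classes, ToyEsmAfter().gradient_checkpointing_enable() succeeds, while the same call on ToyEsmBefore raises a ValueError.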

Hi @Rocketknight1 Thank you for taking a look. I also noticed that the ESM model already passes create_custom_forward to the torch checkpoint function. I will do some more checks and will raise a PR soon.

Sep 10 '23 11:09 sanjeevk-os
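
For readers unfamiliar with the pattern mentioned above, a create_custom_forward closure handed to torch's checkpoint function can be sketched roughly as follows. ToyEncoder, its layer sizes, and the scale argument are hypothetical names made up for illustration; this is not the ESM implementation, only a minimal sketch of the general technique, assuming a PyTorch version whose checkpoint() accepts use_reentrant.

```python
import torch
from torch import nn
from torch.utils.checkpoint import checkpoint


class ToyEncoder(nn.Module):
    """Hypothetical layer stack, used only to show the checkpointing pattern."""

    def __init__(self, hidden_size=16, num_layers=3):
        super().__init__()
        self.layers = nn.ModuleList(
            [nn.Linear(hidden_size, hidden_size) for _ in range(num_layers)]
        )
        self.gradient_checkpointing = False

    def forward(self, hidden_states, scale=1.0):
        for layer in self.layers:
            if self.gradient_checkpointing and self.training:

                def create_custom_forward(module):
                    # Close over the non-tensor argument (scale) so that
                    # checkpoint() only receives tensors as inputs.
                    def custom_forward(*inputs):
                        return torch.tanh(module(*inputs)) * scale

                    return custom_forward

                # Activations inside the layer are recomputed during the
                # backward pass instead of being stored.
                hidden_states = checkpoint(
                    create_custom_forward(layer),
                    hidden_states,
                    use_reentrant=False,
                )
            else:
                hidden_states = torch.tanh(layer(hidden_states)) * scale
        return hidden_states
```

Setting gradient_checkpointing to True on a ToyEncoder in training mode should leave outputs and gradients unchanged; the trade-off is extra compute in the backward pass in exchange for lower activation memory.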

Hi @sanjeevk-os - we're getting even more requests for this, so we'd like to try to add it soon! If you're having trouble, just let us know. We can take over the PR internally to try to get it through, and we appreciate your effort regardless.

Sep 22 '23 12:09 Rocketknight1

This issue has now been resolved - thank you to @sanjeevk-os for the very clean PR!

Sep 26 '23 11:09 Rocketknight1
