
Support gradient checkpointing for ESM models

Open mahdip72 opened this issue 2 years ago • 1 comments

Would you please add support for gradient_checkpointing_enable() to the ESM models? They are currently the best available pre-trained protein language models for researchers. Many thanks.

mahdip72 avatar Jun 30 '23 18:06 mahdip72

cc @Rocketknight1

amyeroberts avatar Jun 30 '23 18:06 amyeroberts

Any updates?

mahdip72 avatar Aug 30 '23 03:08 mahdip72

It's on the to-do list, but I'm afraid there are competing priorities at the moment!

Rocketknight1 avatar Aug 30 '23 12:08 Rocketknight1

Let's open it up for anyone in the community who might want to tackle it :)

amyeroberts avatar Aug 30 '23 12:08 amyeroberts

Hi @amyeroberts @Rocketknight1 I would like to work on this

sanjeevk-os avatar Sep 05 '23 00:09 sanjeevk-os

@sanjeevk-os Great! Once you have the code ready, open a PR and ping both @Rocketknight1 and me. Looking forward to reviewing!

amyeroberts avatar Sep 05 '23 11:09 amyeroberts

Hi @sanjeevk-os, I took a look at the ESM code, and it looks like some of the support for gradient checkpointing is already there. In that case, you just need to make a one-line change to set supports_gradient_checkpointing = True

Rocketknight1 avatar Sep 07 '23 11:09 Rocketknight1
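
As a rough illustration of why a one-line change can be enough, the sketch below models a class-level `supports_gradient_checkpointing` flag that gates `gradient_checkpointing_enable()`. The `Toy...` classes are hypothetical stand-ins written for this example, not the transformers source; they only mirror the general pattern of a base class refusing to enable the feature unless a subclass opts in.

```python
class ToyPreTrainedModel:
    """Hypothetical stand-in for a base model class (not the real library code)."""

    # Subclasses opt in by overriding this class attribute.
    supports_gradient_checkpointing = False

    def __init__(self):
        self.gradient_checkpointing = False

    def gradient_checkpointing_enable(self):
        # Refuse to enable the feature unless the subclass has opted in.
        if not self.supports_gradient_checkpointing:
            raise ValueError(
                f"{type(self).__name__} does not support gradient checkpointing."
            )
        self.gradient_checkpointing = True


class ToyEsmBefore(ToyPreTrainedModel):
    """Before the change: the flag is inherited as False."""


class ToyEsmAfter(ToyPreTrainedModel):
    """After the change: the single added line."""

    supports_gradient_checkpointing = True


# Before the change, enabling checkpointing is rejected.
try:
    ToyEsmBefore().gradient_checkpointing_enable()
    enabled_before = True
except ValueError:
    enabled_before = False

# After the change, the same call succeeds.
after = ToyEsmAfter()
after.gradient_checkpointing_enable()
```

The forward pass still has to check the `gradient_checkpointing` attribute and route layers through a checkpoint function, which is the part the comment describes as already present in the ESM code.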

Hi @Rocketknight1, thank you for taking a look. I also noticed that the ESM model already passes create_custom_forward to the torch checkpoint function. I will do some more checks and raise a PR soon.

sanjeevk-os avatar Sep 10 '23 11:09 sanjeevk-os
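
The `create_custom_forward` pattern mentioned above can be sketched with plain PyTorch as follows. This is a minimal example assuming a small `nn.Sequential` layer as a stand-in for an ESM encoder layer; the helper is written from scratch here and may differ in detail from the one in the ESM modelling code.

```python
import torch
from torch import nn
from torch.utils.checkpoint import checkpoint


def create_custom_forward(module):
    """Wrap a module in a plain function that checkpoint() can re-run."""

    def custom_forward(*inputs):
        return module(*inputs)

    return custom_forward


torch.manual_seed(0)

# Stand-in for one encoder layer (an assumption for illustration only).
layer = nn.Sequential(nn.Linear(8, 8), nn.GELU())
x = torch.randn(4, 8, requires_grad=True)

# Ordinary forward pass: intermediate activations are kept for backward.
out_plain = layer(x)

# Checkpointed forward pass: intermediate activations are dropped and
# recomputed during backward, which saves memory at the cost of compute.
out_ckpt = checkpoint(create_custom_forward(layer), x, use_reentrant=False)

# Gradients flow through the checkpointed path as usual.
out_ckpt.sum().backward()
```

Both paths produce the same output; only the memory and compute profile of the backward pass differs.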

Hi @sanjeevk-os - we're getting even more requests for this, so we'd like to try to add it soon! If you're having trouble, just let us know. We can take over the PR internally to try to get it through, and we appreciate your effort regardless.

Rocketknight1 avatar Sep 22 '23 12:09 Rocketknight1

This issue has now been resolved - thank you to @sanjeevk-os for the very clean PR!

Rocketknight1 avatar Sep 26 '23 11:09 Rocketknight1