
Support for gradient-checkpointing

Open · boxiaowave opened this issue 1 year ago

Hello!

Many thanks for writing this torch framework. Gradient checkpointing is a training technique that saves GPU memory, which is a big help when training large models with limited resources. It is also covered on Su Jianlin's blog, and Hugging Face's transformers has built-in support for it. Would it be possible to add this feature in a future release?

boxiaowave avatar Oct 18 '22 01:10 boxiaowave
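For reference, a minimal sketch of the idea behind gradient checkpointing in plain PyTorch, using torch.utils.checkpoint rather than bert4torch's own implementation (the toy module and sizes are illustrative only): a segment wrapped in checkpoint() does not store its intermediate activations during the forward pass and recomputes them during backward, trading extra compute for lower peak memory.

import torch
import torch.nn as nn
from torch.utils.checkpoint import checkpoint

class CheckpointedMLP(nn.Module):
    """Toy stack of layers whose activations are recomputed during backward."""
    def __init__(self, hidden_size=768, num_layers=12):
        super().__init__()
        self.layers = nn.ModuleList(
            [nn.Sequential(nn.Linear(hidden_size, hidden_size), nn.GELU())
             for _ in range(num_layers)]
        )

    def forward(self, x, use_checkpoint=True):
        for layer in self.layers:
            if use_checkpoint and self.training:
                # Activations inside `layer` are not kept; they are rebuilt in backward.
                x = checkpoint(layer, x)
            else:
                x = layer(x)
        return x

model = CheckpointedMLP().train()
x = torch.randn(8, 128, 768, requires_grad=True)
model(x).sum().backward()  # backward works as usual, with lower peak activation memory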

👌, let me evaluate it on my end first.

Tongjilibo avatar Oct 18 '22 03:10 Tongjilibo

Updated. The call is shown below; please give it a try.

self.bert = build_transformer_model(config_path=config_path, checkpoint_path=checkpoint_path, with_pool=True, gradient_checkpoint=True)

Tongjilibo avatar Oct 23 '22 15:10 Tongjilibo
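For anyone landing here later, a minimal sketch of where the call above typically sits. The wrapper class, forward signature, and the (hidden_states, pooled_output) return pattern follow bert4torch's usual examples and are assumptions that may need adjusting for your version:

import torch.nn as nn
from bert4torch.models import build_transformer_model

class Model(nn.Module):
    def __init__(self, config_path, checkpoint_path):
        super().__init__()
        # gradient_checkpoint=True trades extra forward compute during backward
        # for a lower peak activation memory footprint.
        self.bert = build_transformer_model(
            config_path=config_path,
            checkpoint_path=checkpoint_path,
            with_pool=True,
            gradient_checkpoint=True,
        )

    def forward(self, token_ids, segment_ids):
        # with_pool=True: returns the sequence output and the pooled [CLS] output
        hidden_states, pooled_output = self.bert([token_ids, segment_ids])
        return pooled_output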

Awesome work, you're really dedicated. Thanks!

boxiaowave avatar Oct 23 '22 15:10 boxiaowave