transformers
[deepspeed] offload + non-cpuadam optimizer exception doc
part 2 of https://github.com/huggingface/transformers/pull/22043, but we can't merge it until deepspeed==0.8.3 is released.
This PR documents the new feature and bumps the minimum deepspeed version.
XXX: DO NOT MERGE UNTIL deepspeed==0.8.3 is released.
I'm keeping it as a DRAFT so that I don't mistakenly merge it too soon. But we can pre-approve.
cc: @jeffra
The documentation is not available anymore as the PR was closed or merged.