stoke
[Known Issue] DeepSpeed multiple loss support
If running with multiple losses, DeepSpeed will currently fail because retain_graph is not passed to the backward call within the DeepSpeed engine, which prevents multiple backward calls within stoke.
Please use a single loss function until this can be patched -- in most simple situations with multiple losses, simply add them...
A PR is currently open to fix this: https://github.com/microsoft/DeepSpeed/pull/1149
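For the simple case, here is a minimal sketch of the "add them" workaround in plain PyTorch. It does not use the stoke or DeepSpeed APIs; the model, data, and the two loss functions are hypothetical stand-ins chosen only for illustration. It compares two separate backward calls (the first needing retain_graph=True) against a single backward call on the summed loss, which should accumulate the same gradients.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)

# Hypothetical toy model and data, for illustration only.
model = torch.nn.Linear(4, 2)
x = torch.randn(8, 4)
target = torch.randn(8, 2)


def two_losses(model, x, target):
    """Return two losses that share the same forward graph."""
    out = model(x)
    return F.mse_loss(out, target), F.l1_loss(out, target)


# Variant 1: one backward call per loss. The first call must keep the
# graph alive with retain_graph=True, which is the argument the issue
# says is not forwarded by the DeepSpeed engine.
model.zero_grad()
loss_a, loss_b = two_losses(model, x, target)
loss_a.backward(retain_graph=True)
loss_b.backward()
grad_separate = model.weight.grad.clone()

# Variant 2: sum the losses and call backward once. No retain_graph needed.
model.zero_grad()
loss_a, loss_b = two_losses(model, x, target)
(loss_a + loss_b).backward()
grad_summed = model.weight.grad.clone()

# Gradients are linear in the loss, so both variants should agree
# up to floating-point tolerance.
same = torch.allclose(grad_separate, grad_summed, atol=1e-6)
print("gradients match:", same)
```

This only covers losses that can be combined into one scalar before a single optimizer step. Setups that need separate backward passes (for example, different optimizers stepping between the calls) are not helped by summing.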
We have been working with DeepSpeed and really need the retain_graph support. (We can't simply add our losses.) Waiting for this to be fixed :)
@sualehasif
Heads up, I think you might have wanted to respond to the PR on the deepspeed repo instead of this one!