Logan Adams
@Fhujinwu - can you test with the latest master branch and confirm the above PR fixes this issue?
@DZ9 and @vivek-media - can you test with the latest changes, since [this PR](https://github.com/microsoft/DeepSpeed/pull/3884) should fix the issue?
@DZ9 and @vivek-media - we're going to close this issue and move discussion to the other one. Can you try the latest build from source and confirm if that fix...
Hi @adammoody and @yuchen2580 - not all DeepSpeed ops currently build on AMD. Although the MI200 CI test is broken, it does show some coverage. For those that AMD is...
@rraminen - curious what you were running when you came across this?
@shaowei-su and @stgzr - taking a look now
Hi @shaowei-su - a number of fixes have gone in. Could you re-run and confirm whether you still have this issue? Apologies for taking so long to get to this.
@shaowei-su or @stgzr - I don't have access to any P4 GPUs. Do you know if this will repro with just the A100s?
Thanks! Apologies, I was confusing this with a fix for a different GH issue. But I guess it's good to know whether it reproduces with the latest DeepSpeed for debugging, but seems...
@shaowei-su - have you tried reaching out to the accelerate folks on this since their data loader is what is throwing the error? Since I may not be able to...