jeffhataws
This PR fixes the "RuntimeError: No CUDA GPUs are available" error when running with the --bf16 option on Neuron. Related PRs: https://github.com/huggingface/transformers/pull/20684 https://github.com/huggingface/transformers/pull/22300 # What does this PR do? While PR #22300...
This PR updates the XLA ZeRO1 implementation to use [allgather coalesced](https://github.com/pytorch/xla/pull/5950) and [reduce-scatter coalesced](https://github.com/pytorch/xla/pull/5956).
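The intent behind coalescing can be sketched in plain Python (an illustrative simulation only, not the torch_xla API): instead of launching one all-gather per parameter shard, each rank packs all of its shards into a single flat buffer, one collective is issued, and the result is unpacked per parameter.

```python
# Illustrative sketch of coalesced all-gather (names and data layout are
# assumptions for demonstration; torch_xla's real API operates on XLA tensors).

def all_gather(buffers_per_rank):
    """Stand-in for one collective launch: concatenate every rank's buffer."""
    out = []
    for buf in buffers_per_rank:
        out.extend(buf)
    return out

def naive_gather(shards_per_rank):
    # One collective launch per parameter: len(params) launches total.
    num_params = len(shards_per_rank[0])
    return [all_gather([rank[p] for rank in shards_per_rank])
            for p in range(num_params)]

def coalesced_gather(shards_per_rank):
    # Pack all shards into one flat buffer per rank, launch a single
    # collective, then unpack the per-parameter results.
    sizes = [len(s) for s in shards_per_rank[0]]
    packed = [[x for shard in rank for x in shard] for rank in shards_per_rank]
    flat = all_gather(packed)          # the single collective launch
    per_rank = sum(sizes)
    result, offset = [], 0
    for size in sizes:
        gathered = []
        for r in range(len(shards_per_rank)):
            start = r * per_rank + offset
            gathered.extend(flat[start:start + size])
        result.append(gathered)
        offset += size
    return result

# Two ranks, two parameters: both strategies produce identical results,
# but the coalesced version issues one collective instead of two.
shards = [[[1, 2], [5]],    # rank 0: shard of param0, shard of param1
          [[3, 4], [6]]]    # rank 1
print(naive_gather(shards))      # [[1, 2, 3, 4], [5, 6]]
print(coalesced_gather(shards))  # [[1, 2, 3, 4], [5, 6]]
```

The payoff in practice is fewer collective launches and less per-launch overhead when a ZeRO1 optimizer step touches many small parameters.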
## 🐛 Bug Currently the ZeRO1 test test/test_zero1.py is disabled for GPU since version 2.1 (https://github.com/pytorch/xla/pull/4912). We should re-enable it for GPU to restore coverage for reduce-scatter/all-gather. When I tried with torch/xla...
## 🐛 Bug We use XLA_DISABLE_FUNCTIONALIZATION=1 in torch-xla 2.1 to work around the trace slowdown issue (https://github.com/pytorch/xla/issues/6294). However, we are encountering a strange issue with the reproduction code in the next...
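For reference, the workaround mentioned above is set through an environment variable before launching training (the variable name comes from the issue; the launch command is a placeholder):

```shell
# Disable functionalization to work around the tracing slowdown
# tracked in https://github.com/pytorch/xla/issues/6294
export XLA_DISABLE_FUNCTIONALIZATION=1
# python train.py ...   # placeholder for the actual training launch
```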
## 🐛 Bug When functionalization is on (XLA_DISABLE_FUNCTIONALIZATION=0), I see that there are fewer aliased tensors. Jack has a patch to increase the number of aliased tensors: https://github.com/pytorch/xla/commit/e3fc03314dab5f44e3ed9ccbba6c15fbca3285cd. However,...