Manfei
possible fix for this issue: ``` (Reading database ... 106380 files and directories currently installed.) Step #4 - "run_e2e_tests": Removing google-cloud-cli-gke-gcloud-auth-plugin (463.0.0-0) ... Step #4 - "run_e2e_tests": + apt-get -y...
# Description Please include a summary of relevant context/issue and your changes. # Tests Please describe the tests that you ran on TPUs to verify changes. **Instruction and/or command lines...
# Description Update hf-diffusers.libsonnet checkpoint steps to skip saving checkpoints # Tests Please describe the tests that you ran on TPUs to verify changes. **Instruction and/or command lines to reproduce...
Add fmax, fmin, frexp
## Fix the model test for `stable_diffusion_unet.py` 1. Set up the environment according to [Run a model under torch_xla2](https://github.com/pytorch/xla/blob/master/experimental/torch_xla2/docs/support_a_new_model.md) 2. Run the model test under `run_torchbench/` with `python models/your_target_model_name.py` 3. Fix the failure....
## Fix the model test for `yolov3.py` 1. Set up the environment according to [Run a model under torch_xla2](https://github.com/pytorch/xla/blob/master/experimental/torch_xla2/docs/support_a_new_model.md) 2. Run the model test under `run_torchbench/` with `python models/your_target_model_name.py` 3. Fix the failure....
## 🐛 Bug A newly built GPU Docker image for PyTorch/XLA 2.5 from the `r2.5` branch passed `import torch_xla` and `PJRT_DEVICE=CPU python test/test_train_mp_mnist.py`, but failed the MNIST test with `PJRT_DEVICE=CUDA`: https://gist.github.com/ManfeiBai/f9efab9ce534970b7d9537006ff50a1a - `8...