Wonjoo Lee
PyTorch python op tests are failing:

```
======================================================================
ERROR: test_upsamplingNearest2d_xla (__main__.TestNNDeviceTypeXLA)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/opt/conda/lib/python3.7/site-packages/torch/testing/_internal/common_device_type.py", line 391, in instantiated_test
    raise rte
  File "/opt/conda/lib/python3.7/site-packages/torch/testing/_internal/common_device_type.py", line 378, in...
```
Ok, after removing `upsample_nearest2d.vec` and updating the existing `GetOutputSizeWithScale` function to accept `scale_h` and `scale_w`, as here: https://github.com/pytorch/xla/blob/1d0c3393fb48cb8740379e4fea9c37a0e131a7dd/torch_xla/csrc/aten_xla_type.cpp#L165-L175, I can confirm that the related cpp tests pass: `UpsampleNearest2D`, `UpsampleNearest2DWithScale`,...
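For context, a minimal Python sketch of the logic a `GetOutputSizeWithScale`-style helper would implement, assuming the usual PyTorch convention that the output spatial size is `floor(input * scale)` per dimension (the function name and signature here are illustrative, not the actual C++ code at the link):

```python
import math

# Illustrative sketch: compute an upsampled NCHW output size from
# per-dimension scale factors, assuming output = floor(input * scale).
def get_output_size_with_scale(input_size, scale_h, scale_w):
    n, c, h, w = input_size
    return [n, c,
            int(math.floor(h * scale_h)),
            int(math.floor(w * scale_w))]

print(get_output_size_with_scale([1, 3, 4, 4], 2.0, 2.0))  # [1, 3, 8, 8]
```

Passing `scale_h` and `scale_w` separately (rather than a single scale) is what lets the non-uniform-scale cpp tests above exercise different factors per spatial dimension.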
One guess is that while the total amount of HBM is equal between v3 and v4, the HBM bandwidth of v4 is higher than that of v3 (https://cloud.google.com/tpu/docs/system-architecture-tpu-vm#tpu_v4), so it...
I've opened https://github.com/pytorch/xla/pull/4480 (picking up @milesial's https://github.com/pytorch/xla/pull/4471). @milesial, can we update the XLA commit pin (https://github.com/pytorch/pytorch/blob/master/.github/ci_commit_pins/xla.txt) in this PR to `eddaa4b3cf7c4c9302b6b04c6e5d13b4c6ba260b` and let the CI verify? Thanks!
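Updating the pin amounts to rewriting a one-line file in the pytorch/pytorch checkout. A sketch (the path is from the linked URL; writing it via Python rather than `echo` is just for illustration):

```python
from pathlib import Path

# Sketch: rewrite the XLA commit pin file in a pytorch/pytorch checkout.
# CI reads this file to decide which pytorch/xla commit to build against.
pin = Path(".github/ci_commit_pins/xla.txt")
pin.parent.mkdir(parents=True, exist_ok=True)  # ensure path exists for this demo
pin.write_text("eddaa4b3cf7c4c9302b6b04c6e5d13b4c6ba260b\n")
print(pin.read_text().strip())
```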
~So, it looks like the XLA PR treats `RuntimeError`s as failures, so I updated https://github.com/pytorch/xla/pull/4480 to explicitly skip the `test_clip_grad_value_foreach_True_*` and `test_clip_grad_norm_foreach_True_*` tests.~ Oh, I just saw you...
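For anyone unfamiliar with how an explicit skip looks in a `unittest`-based suite like these, a self-contained sketch (the test names echo the patterns above; the actual skip mechanism in the XLA PR may differ, e.g. a skip list in the harness):

```python
import unittest

class TestClipGrad(unittest.TestCase):
    # Explicitly skipped rather than allowed to fail with RuntimeError.
    @unittest.skip("foreach clip_grad not supported on XLA yet")
    def test_clip_grad_value_foreach_True_xla(self):
        pass

    def test_other(self):
        self.assertTrue(True)

suite = unittest.defaultTestLoader.loadTestsFromTestCase(TestClipGrad)
result = unittest.TextTestRunner(verbosity=0).run(suite)
print(len(result.skipped))  # one skipped test
```

The point of skipping explicitly is that the suite stays green while recording exactly which tests are excluded and why.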
Looks like the CIs are green on both sides. Let's coordinate a merge tomorrow, thanks!
> @wonjoolee95 CI passed, can you merge the XLA PR? @milesial, just merged to master. The new pin should be `eac4e547138ab22a9b41c6f96208613fd7dd19d5`.
Okay, since we can't force merge this right now, I'm going to revert the XLA PR lol.
You can update the XLA pin in this PR to `8dcab83819368f468dadbe6e81b064d268830df2` and `merge -g`. I'll merge the companion XLA PR once this one merges.
Thanks for reporting the issue. Pasting the error here for visibility:

```
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
Cell In[8], line 4
      1 generator = torch.Generator().manual_seed(0)
      2 # xm.mark_step...
```