Ronghang Hu

Results 46 comments of Ronghang Hu

A follow-up on this: the underlying cause here seems to be that autograd treats `.t()` + `.mm()` differently between native PyTorch and PyTorch/XLA.

```python3
import os
os.environ["XLA_IR_DEBUG"] = "1"
import ...
```
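For context, a plain-numpy illustration (not the PyTorch/XLA trace itself): the "linear" form and the explicit transpose-then-matmul form compute identical values, so the discrepancy above lives in how autograd records the ops, not in the forward math:

```python
import numpy as np

x = np.arange(6.0).reshape(2, 3)
w = np.arange(12.0).reshape(4, 3)

# Formulation 1: the "linear" form, x @ w^T
linear_out = x @ w.T

# Formulation 2: explicit transpose then matmul, mirroring .t() + .mm()
t_mm_out = np.matmul(x, np.transpose(w))

# Numerically identical; the reported divergence is in the autograd graph
# each formulation produces, not in the forward values.
print(np.array_equal(linear_out, t_mm_out))  # True
```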

> Thanks ronghang, I will try to take a look soon.

Thanks, Jack! For now, we can patch this discrepancy via `torch.nn.functional.linear = xla_patched_linear` in https://gist.github.com/ronghanghu/d82ede74c434f8c12ae3ffb65ec84b45, so it's not blocking...
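The gist's actual patch isn't reproduced here; as an illustration only, the monkey-patching pattern looks roughly like this, with a numpy stand-in for `torch.nn.functional` and a placeholder `xla_patched_linear` (both are assumptions, not the real XLA-aware implementation):

```python
import numpy as np

class functional:
    """Stand-in for torch.nn.functional, just to show the patching pattern."""
    @staticmethod
    def linear(x, weight, bias=None):
        # Default formulation: x @ weight^T (+ bias)
        out = x @ weight.T
        return out + bias if bias is not None else out

def xla_patched_linear(x, weight, bias=None):
    # Hypothetical replacement expressing the same math in a form whose
    # autograd lowering would match between native PyTorch and PyTorch/XLA.
    out = np.matmul(x, np.transpose(weight))
    return out + bias if bias is not None else out

# The patch itself is a one-line attribute swap, as in the comment above:
functional.linear = xla_patched_linear

x = np.ones((2, 3))
w = np.ones((4, 3))
print(functional.linear(x, w).shape)  # (2, 4)
```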

> _ZN10tensorflow7strings6StrCatB5cxx11ERKNS0_8AlphaNumE

Looks like `tensorflow::strings::StrCat` is missing. This error seems to come from the `std::string` ABI change between GCC 4.x and 5.x: https://developers.redhat.com/blog/2015/02/05/gcc5-and-the-c11-abi/

A tricky workaround I'm using is to split the forward and backward passes into multiple `sess.run` calls, and glue them together with some auxiliary variables. For example, the first `sess.run` can store...
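The comment is truncated, but the general pattern can be sketched framework-agnostically (plain numpy here, not the original TensorFlow code): the first call computes the forward pass and stashes intermediates in auxiliary storage, and a separate second call reads them back to compute gradients:

```python
import numpy as np

# Auxiliary storage playing the role of the extra variables that glue
# the two sess.run calls together.
cache = {}

def forward_step(x, w):
    """First 'sess.run': compute the forward pass and store intermediates."""
    h = x @ w           # intermediate pre-activation
    y = np.tanh(h)
    cache["x"], cache["h"] = x, h
    return y

def backward_step(grad_y):
    """Second 'sess.run': reuse the cached intermediates for the gradient."""
    x, h = cache["x"], cache["h"]
    grad_h = grad_y * (1.0 - np.tanh(h) ** 2)  # derivative of tanh
    grad_w = x.T @ grad_h
    return grad_w

rng = np.random.default_rng(0)
x = rng.standard_normal((5, 3))
w = rng.standard_normal((3, 2))
y = forward_step(x, w)
grad_w = backward_step(np.ones_like(y))
print(grad_w.shape)  # (3, 2)
```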

@MycChiu Yes, that was a big headache for me. My tricky implementation involves another forward pass when calculating the gradients, and to make it work I had to remove all...

Also, `sess.partial_run` may be a solution, but somehow I couldn't get it to work.

My model involves first generating a parsing tree from a neural parser (implemented in TensorFlow) and then running a TreeLSTM over the resulting parsing tree (also implemented in TensorFlow, using Fold)....

@JackCaoG Thanks for looking into this!

> A quick workaround is to also set XLA_IR_DEBUG=1 during profiling.

Earlier I tried setting both `XLA_IR_DEBUG=1` and `XLA_HLO_DEBUG=1`, and it still cannot propagate...

Update on this: it seems that **PJRT profiling no longer works on a pod with the torch_xla 20220916 wheel**, although it was working with the 20220829 torch_xla wheel. ...

> @ronghanghu do you know if this still works on donut?

Yes, the PJRT profiling still works on a donut -- I can capture the device trace with the script...