Ronghang Hu

Results 46 comments of Ronghang Hu

A follow-up on this: the underlying cause here seems to be that autograd treats `.t()` + `.mm()` differently between native PyTorch and PyTorch/XLA.

```python3
import os
os.environ["XLA_IR_DEBUG"] = "1"
import ...
```
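For context, a plain-numpy illustration (not the PyTorch/XLA trace itself): the "linear" form and the explicit transpose-then-matmul form compute identical values, so the discrepancy above lives in how autograd records the ops, not in the forward math:

```python
import numpy as np

x = np.arange(6.0).reshape(2, 3)
w = np.arange(12.0).reshape(4, 3)

# Formulation 1: the "linear" form, x @ w^T
linear_out = x @ w.T

# Formulation 2: explicit transpose then matmul, mirroring .t() + .mm()
t_mm_out = np.matmul(x, np.transpose(w))

# Numerically identical; the reported divergence is in the autograd graph
# each formulation produces, not in the forward values.
print(np.array_equal(linear_out, t_mm_out))  # True
```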

> Thanks ronghang, I will try to take a look soon.

Thanks, Jack! For now, we can patch this discrepancy via `torch.nn.functional.linear = xla_patched_linear` in https://gist.github.com/ronghanghu/d82ede74c434f8c12ae3ffb65ec84b45, so it's not blocking...
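The gist's actual patch isn't reproduced here; as an illustration only, the monkey-patching pattern looks roughly like this, with a numpy stand-in for `torch.nn.functional` and a placeholder `xla_patched_linear` (both are assumptions, not the real XLA-aware implementation):

```python
import numpy as np

class functional:
    """Stand-in for torch.nn.functional, just to show the patching pattern."""
    @staticmethod
    def linear(x, weight, bias=None):
        # Default formulation: x @ weight^T (+ bias)
        out = x @ weight.T
        return out + bias if bias is not None else out

def xla_patched_linear(x, weight, bias=None):
    # Hypothetical replacement expressing the same math in a form whose
    # autograd lowering would match between native PyTorch and PyTorch/XLA.
    out = np.matmul(x, np.transpose(weight))
    return out + bias if bias is not None else out

# The patch itself is a one-line attribute swap, as in the comment above:
functional.linear = xla_patched_linear

x = np.ones((2, 3))
w = np.ones((4, 3))
print(functional.linear(x, w).shape)  # (2, 4)
```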

> _ZN10tensorflow7strings6StrCatB5cxx11ERKNS0_8AlphaNumE

Looks like `tensorflow::strings::StrCat` is missing. This error seems to come from the `std::string` ABI change between GCC 4.x and 5.x: https://developers.redhat.com/blog/2015/02/05/gcc5-and-the-c11-abi/

A tricky workaround I'm using is to split the forward and backward passes into multiple `sess.run` calls, and glue them together with some auxiliary variables. For example, the first `sess.run` can store...
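The comment is truncated, but the general pattern can be sketched framework-agnostically (plain numpy here, not the original TensorFlow code): the first call computes the forward pass and stashes intermediates in auxiliary storage, and a separate second call reads them back to compute gradients:

```python
import numpy as np

# Auxiliary storage playing the role of the extra variables that glue
# the two sess.run calls together.
cache = {}

def forward_step(x, w):
    """First 'sess.run': compute the forward pass and store intermediates."""
    h = x @ w           # intermediate pre-activation
    y = np.tanh(h)
    cache["x"], cache["h"] = x, h
    return y

def backward_step(grad_y):
    """Second 'sess.run': reuse the cached intermediates for the gradient."""
    x, h = cache["x"], cache["h"]
    grad_h = grad_y * (1.0 - np.tanh(h) ** 2)  # derivative of tanh
    grad_w = x.T @ grad_h
    return grad_w

rng = np.random.default_rng(0)
x = rng.standard_normal((5, 3))
w = rng.standard_normal((3, 2))
y = forward_step(x, w)
grad_w = backward_step(np.ones_like(y))
print(grad_w.shape)  # (3, 2)
```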

@MycChiu Yes, that was a big headache for me. My tricky implementation involves another forward pass when calculating the gradients, and to make it work I had to remove all...

Also, `sess.partial_run` may be a solution, but somehow I couldn't get it to work.

My model involves first generating a parsing tree from a neural parser (implemented in TensorFlow) and then running a TreeLSTM over the resulting parsing tree (also implemented in TensorFlow, using Fold)....

@JackCaoG Thanks for looking into this!

> A quick workaround is to also set XLA_IR_DEBUG=1 during profiling.

Earlier I tried setting both `XLA_IR_DEBUG=1` and `XLA_HLO_DEBUG=1`, and it still cannot propagate...

Update on this: it seems that **PJRT profiling no longer works on a pod with the torch_xla 20220916 wheel**, although it was working with the 20220829 torch_xla wheel. ...

> @ronghanghu do you know if this still works on donut?

Yes, the PJRT profiling still works on a donut -- I can capture the device trace with the script...