pytorch-tree-lstm

in-place operations?

Open michael-hahn opened this issue 5 years ago • 4 comments

Thank you for your implementation of PyTorch Tree LSTM.

In your model, however, there seem to be many in-place operations. Will these cause any problems with autograd?

For example, in tree_lstm.py:

        if iteration == 0:
            c[node_mask, :] = i * u  # in-place write into a slice of c

When I ran the code with PyTorch 1.2 (Preview, Nightly), I received the following error:

RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation: [torch.FloatTensor [1705, 64]], which is output 0 of SliceBackward, is at version 6177; expected version 5946 instead. Hint: the backtrace further above shows the operation that failed to compute its gradient.
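For what it's worth, I believe the same failure mode can be reproduced in isolation (a minimal sketch, not taken from your repo):

    import torch

    x = torch.randn(4, 3, requires_grad=True)
    y = torch.sigmoid(x)                # SigmoidBackward saves its output y
    mask = torch.tensor([True, False, True, False])
    y[mask, :] = 0.0                    # in-place write bumps y's version counter
    y.sum().backward()                  # RuntimeError: ... modified by an inplace operation

The backward for sigmoid needs its saved output, and the in-place write changes that tensor's version counter, which is exactly the version mismatch the error message reports.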

michael-hahn avatar Jun 21 '19 17:06 michael-hahn

Very interesting. We have not run this code on PyTorch 1.2; our tests on PyTorch 1.0 seemed to show a complete autograd graph, and we thought it was computing properly. (Also, we were able to fit the model to our data, so the gradients must be at least approximately correct.) But we may be doing it wrong; I'll install PyTorch 1.2 and see whether it has new diagnostic checks, or what is going on.

freedryk avatar Jun 24 '19 16:06 freedryk

Thanks! The code also ran correctly on PyTorch 1.1.0, without raising this runtime error.

I am also wondering whether there is some guarantee provided by the structure of the data, since my input data sometimes contains DAGs instead of trees. However, since the nodes of a DAG can also be totally (and in particular topologically) ordered, I don't see why that would cause any trouble.
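For concreteness, the ordering I have in mind is a standard topological sort, e.g. Kahn's algorithm (the node/adjacency format below is just for illustration):

    from collections import deque

    def topo_order(nodes, children):
        """Kahn's algorithm: works for any DAG, not just trees."""
        indegree = {n: 0 for n in nodes}
        for n in nodes:
            for child in children.get(n, []):
                indegree[child] += 1
        queue = deque(n for n in nodes if indegree[n] == 0)
        order = []
        while queue:
            n = queue.popleft()
            order.append(n)
            for child in children.get(n, []):
                indegree[child] -= 1
                if indegree[child] == 0:
                    queue.append(child)
        return order  # len(order) < len(nodes) would indicate a cycle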

michael-hahn avatar Jun 24 '19 16:06 michael-hahn

Yeah, DAGs should work the same as trees, I think. It may just be that we misunderstood how in-place updates work. We were worried about this as a possible issue, so we did some investigation and generated some autograd graphs to see what it was doing. I'll see if I can generate a version of the code that doesn't rely on in-place updating...
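For example, the slice assignment quoted above could in principle become an out-of-place update; here is one sketch using masked_scatter, with illustrative shapes and names (not our actual fix):

    import torch

    num_nodes, hidden = 5, 4
    c = torch.zeros(num_nodes, hidden)                  # cell states for all nodes
    node_mask = torch.tensor([True, False, True, False, True])
    i = torch.rand(int(node_mask.sum()), hidden, requires_grad=True)   # input gate
    u = torch.randn(int(node_mask.sum()), hidden, requires_grad=True)  # candidate values

    # masked_scatter (the out-of-place variant) returns a NEW tensor instead of
    # mutating c, so autograd never sees a version-counter change on a saved tensor.
    c = c.masked_scatter(node_mask.unsqueeze(1), i * u)

The key point is that the c on the left-hand side is a fresh tensor each iteration, so no tensor saved for backward is ever modified.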

freedryk avatar Jun 24 '19 17:06 freedryk

Thanks for your update!

I am quite a newbie to PyTorch and ML, so if you don't mind me asking, what tool are you using to generate the autograd graphs? I tried tensorboardX, but it does not seem to work well (I received an error that does not seem to have a fix at the moment). The reason I ask is that when I ran your code on my data (on PyTorch 1.1.0, with no runtime error about in-place operations), I ran into NaN gradients reported by PyTorch's anomaly detection, so I was wondering whether the autograd graph could help with debugging.
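For reference, the anomaly detection I mentioned is just the standard context manager (model and batch below are placeholders for my own code):

    import torch

    # torch.autograd.detect_anomaly makes backward() fail at the op that
    # produced the NaN, with a traceback pointing back to the forward pass.
    with torch.autograd.detect_anomaly():
        loss = model(batch)   # placeholder forward pass
        loss.backward()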

Thank you so much.

michael-hahn avatar Jun 24 '19 17:06 michael-hahn