Null gradient when turning free_graph off
The following snippet prints null gradients, whereas using backward(c, true) gives the correct values (5.0, 2.0):
using FloatD = DiffArray<float>;
FloatD a = 2.0f;
FloatD b = 5.0f;
set_requires_gradient(a);
set_requires_gradient(b);
FloatD c = a * b;
backward(c, false);
LOG << "dc/da = " << gradient(a);
LOG << "dc/db = " << gradient(b);
Output:
dc/da = 0
dc/db = 0
Expected:
dc/da = 5.0
dc/db = 2.0
Built with MSVC16, without CUDA, commit e240a4b
edit: The line that cancels the gradients is this one: https://github.com/mitsuba-renderer/enoki/blob/master/src/autodiff/autodiff.cpp#L896 I am not sure what this reference counter represents, but shouldn't the condition be if (target.ref_count_int == 0) rather than > 0?
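To make the suggestion concrete, here is a paraphrased sketch of the condition flip I have in mind (this is not the actual source; only ref_count_int and the comparison come from the linked line, the body is a placeholder):

// Paraphrased sketch, not a copy of src/autodiff/autodiff.cpp.
// Condition at the linked line (commit e240a4b), as I read it:
if (target.ref_count_int > 0) {
    /* ... the gradient of 'target' gets discarded here ... */
}

// Suggested condition instead:
if (target.ref_count_int == 0) {
    /* ... the gradient of 'target' gets discarded here ... */
}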
Hi @eliemichel,
I am not sure about DiffArray<float>. IIRC, automatic differentiation in enoki is only supported on top of CUDAArray. Do you see the same issue when using DiffArray<CUDAArray<float>> instead?
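For reference, roughly what I mean (untested sketch; I am assuming the enoki/cuda.h and enoki/autodiff.h headers and printing via std::cout instead of your LOG macro):

#include <iostream>
#include <enoki/cuda.h>
#include <enoki/autodiff.h>

using namespace enoki;

using FloatC = CUDAArray<float>;
using FloatD = DiffArray<FloatC>;

int main() {
    // Same repro as above, but on the CUDA backend
    FloatD a = 2.0f;
    FloatD b = 5.0f;
    set_requires_gradient(a);
    set_requires_gradient(b);

    FloatD c = a * b;
    backward(c, false);  // free_graph = false

    std::cout << "dc/da = " << gradient(a) << std::endl;
    std::cout << "dc/db = " << gradient(b) << std::endl;
    return 0;
}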
Regarding the other issue I don't have any precise idea, but for this one, what do you think of the suggested fix of changing line 896 of autodiff.cpp? Or did I misunderstand the meaning of ref_count_int?
I could reproduce the problem with the "Interfacing with PyTorch" example from the documentation, just by changing FloatD.backward() to FloatD.backward(free_graph=False). I also added imports for FloatC and FloatD, since the example did not run out of the box, but I guess that is unrelated. I ended up here after trying to modify mitsuba's autodiff function render_torch so that it does not wipe the AD graph.
import torch
import enoki
from enoki.cuda_autodiff import Float32 as FloatD
from enoki.cuda import Float32 as FloatC


class EnokiAtan2(torch.autograd.Function):
    @staticmethod
    def forward(ctx, arg1, arg2):
        # Convert input parameters to Enoki arrays
        ctx.in1 = FloatD(arg1)
        ctx.in2 = FloatD(arg2)

        # Inform Enoki if PyTorch wants gradients for one/both of them
        enoki.set_requires_gradient(ctx.in1, arg1.requires_grad)
        enoki.set_requires_gradient(ctx.in2, arg2.requires_grad)

        # Perform a differentiable computation in Enoki
        ctx.out = enoki.atan2(ctx.in1, ctx.in2)

        # Convert the result back into a PyTorch array
        out_torch = ctx.out.torch()

        # Optional: release any cached memory from Enoki back to PyTorch
        enoki.cuda_malloc_trim()

        return out_torch

    @staticmethod
    def backward(ctx, grad_out):
        # Attach gradients received from PyTorch to the output
        # variable of the forward pass
        enoki.set_gradient(ctx.out, FloatC(grad_out))

        # Perform a reverse-mode traversal. Note that the static
        # version of the backward() function is being used, see
        # the following subsection for details on this
        FloatD.backward(free_graph=False)

        # Fetch gradients from the input variables and pass them on
        result = (enoki.gradient(ctx.in1).torch()
                  if enoki.requires_gradient(ctx.in1) else None,
                  enoki.gradient(ctx.in2).torch()
                  if enoki.requires_gradient(ctx.in2) else None)

        # Garbage-collect Enoki arrays that are now no longer needed
        del ctx.out, ctx.in1, ctx.in2

        # Optional: release any cached memory from Enoki back to PyTorch
        enoki.cuda_malloc_trim()

        return result


# Create enoki_atan2(y, x) function
enoki_atan2 = EnokiAtan2.apply

# Let's try it!
y = torch.tensor(1.0, device='cuda')
x = torch.tensor(2.0, device='cuda')
y.requires_grad_()
x.requires_grad_()

o = enoki_atan2(y, x)
print(o)
o.backward()
print(y.grad)
print(x.grad)
The modified example prints:
tensor([0.4636], device='cuda:0', grad_fn=<EnokiAtan2Backward>)
tensor(0., device='cuda:0')
tensor(0., device='cuda:0')
Whereas the unmodified example prints:
tensor([0.4636], device='cuda:0', grad_fn=<EnokiAtan2Backward>)
tensor(0.4000, device='cuda:0')
tensor(-0.2000, device='cuda:0')