slang-python
Accumulating gradient_output in the BRDF example
Hello, I was wondering if the gradient_output in the BRDF example perhaps needs a zero_() in the learning inner loop (i.e. before calling m.brdf_loss.bwd), similar to calling optimizer.zero_grad() in PyTorch?
Otherwise, wouldn't the code accumulate the gradient with each sample, while also immediately applying it?
Apologies if this is intentional, or a zero-fill is already implied somewhere, or I misunderstood :)
Thanks! bert
PS Sorry, I don't have Jupyter set up to test a merge request.
```python
for i in range(10000):
    L = random_hemi_vector()
    V = (0.0, 0.0, 1.0)
    input_params = (*L, *V)

    loss_output = torch.zeros((original_shape[0], original_shape[1], 1)).cuda()
    output_grad = torch.ones_like(loss_output).cuda()

    m.brdf(input=full_res_brdf,
           output=lighting_from_full_res_brdf,
           input_params=input_params).launchRaw(blockSize=block_size, gridSize=grid_size)

    gradient_output.zero_()  # ++++++++++++++++++++++ proposed addition

    m.brdf_loss.bwd(input=(half_res_brdf, gradient_output),
                    output=(loss_output, output_grad),
                    reference=lighting_from_full_res_brdf,
                    input_params=input_params).launchRaw(blockSize=block_size, gridSize=grid_size)

    # Clip gradients and replace NaNs.
    gradient_output = torch.nan_to_num(gradient_output, 0.0)
    gradient_output = torch.clamp(gradient_output, -1.0, 1.0)
    half_res_brdf = torch.clip(half_res_brdf - 0.001 * gradient_output, 0.0001, 1.0)
```
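To illustrate the accumulation concern in isolation, here is a minimal pure-PyTorch sketch (hypothetical tensors, not the BRDF example itself) showing how gradients pile up across backward passes unless the buffer is zeroed each iteration:

```python
import torch

# A toy parameter; d(loss)/dw = 3.0 on every step below.
w = torch.tensor([2.0], requires_grad=True)

for step in range(3):
    loss = (w * 3.0).sum()
    loss.backward()
    # Without zeroing, .grad accumulates: 3.0, then 6.0, then 9.0.
    print(w.grad.item())

# Analogous to the proposed gradient_output.zero_():
w.grad.zero_()
loss = (w * 3.0).sum()
loss.backward()
print(w.grad.item())  # back to 3.0 after zeroing
```

If the BRDF example's gradient buffer behaves the same way, each gradient-descent step would apply the sum of all previous samples' gradients rather than just the current one.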