oneflow.nn.Fold may produce inconsistent results when run with and without requires_grad_(True)
Summary
oneflow.nn.Fold may produce inconsistent results when run with and without requires_grad_(True).
The following code snippet produces count != 0:
import oneflow

count = 0
input = oneflow.rand(2, 72, 16, dtype=oneflow.float64).cuda()
# Same values as input, but tracked by autograd
input_grad = input.clone().requires_grad_(True)
for i in range(100):
    mod = oneflow.nn.Fold(dilation=1, kernel_size=3, output_size=[8, 8], padding=1, stride=2)
    output = oneflow.sum(mod(input))
    output_grad = oneflow.sum(mod(input_grad))
    # Count the iterations where the two sums are not exactly equal
    if not output == output_grad:
        count += 1
print(count)
System Information
- What is your OneFlow installation (pip, source, dockerhub): pip
- OS: Ubuntu 20.04.2 LTS
- OneFlow version (run python3 -m oneflow --doctor): 0.7.0+cu112
- Python version: 3.8.8
- CUDA driver version: 11.4
I ran your script and got a different value for count. This looks like incidental floating-point error rather than a bug. If you think otherwise, could you provide data that reproduces it?
Since you are comparing sums of floating-point numbers, a different order of operations can produce a different result. I recommend using np.allclose to compare the results, since it lets you set an appropriate tolerance, like this:
np.allclose(output.numpy(), output_grad.numpy(), atol=1e-4, rtol=1e-4)
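To illustrate the point about operation order, here is a minimal NumPy-only sketch. It does not call OneFlow, and the helper max_abs_diff is a hypothetical name introduced only for this example. It shows that the same three numbers summed in a different grouping are not exactly equal, while a tolerance-based check still passes:

```python
import numpy as np

# Floating-point addition is not associative: the grouping changes the rounding.
left_to_right = (0.1 + 0.2) + 0.3
right_to_left = 0.1 + (0.2 + 0.3)

# Exact comparison, as done with == in the script above.
exactly_equal = left_to_right == right_to_left

# Tolerance-based comparison, using the same tolerances as suggested above.
close_enough = np.allclose(left_to_right, right_to_left, atol=1e-4, rtol=1e-4)


def max_abs_diff(a, b):
    """Hypothetical helper: largest absolute elementwise difference."""
    return float(np.max(np.abs(np.asarray(a) - np.asarray(b))))


diff = max_abs_diff(left_to_right, right_to_left)
print(exactly_equal, close_enough, diff)
```

If the largest difference between two tensors is on the order of rounding error, a tolerance-based check is probably the right comparison. If it is comparable to the magnitude of the values themselves, reordering alone is unlikely to explain it.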
Thanks for your reply! However, the results are still inconsistent if oneflow.sum() is not used.
import oneflow

count = 0
input = oneflow.rand(2, 72, 16, dtype=oneflow.float64).cuda()
# Same values as input, but tracked by autograd
input_grad = input.clone().requires_grad_(True)
for i in range(1):
    mod = oneflow.nn.Fold(dilation=1, kernel_size=3, output_size=[8, 8], padding=1, stride=2)
    output = mod(input)
    output_grad = mod(input_grad)
    # print(output == output_grad)
    # Compare elementwise, without reducing via oneflow.sum()
    if not (output == output_grad).all():
        print(output - output_grad)
        count += 1
print(count)
The difference between output and output_grad is printed: