Getting NaNs... manual initialization required?
@pvjosue I tried swapping this in for conv2D and conv3D in a Unet. I am getting the correct shape but all NaNs as output. Do kernel_initializer and bias_initializer need to be set manually?
Yes, I recently observed this as well and will investigate. My guess is that something changed in the latest pytorch release. Thanks!
@pvjosue Have you found a solution to this with the latest version of pytorch? Did they deprecate this ability in the latest release? This is exactly the type of package I was looking for. I would be willing to use an older version of pytorch to use this.
Let's see: with user-provided initialization, it seems to work out of the box.
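For reference, "user-provided initialization" here means passing callables for kernel_initializer and bias_initializer so that no parameter is left with whatever happened to be in memory. A minimal sketch; apart from those two keyword arguments, which are mentioned in this thread, the import path and the remaining constructor arguments are assumptions and may not match the actual convNd signature:

import math
import torch
import torch.nn as nn
from convNd import convNd  # import path assumed; adjust to how the repo is installed

# Explicit initializers so neither the kernel nor the bias relies on the default init.
conv4d = convNd(
    in_channels=2,
    out_channels=4,
    num_dims=4,              # constructor arguments other than the two
    kernel_size=3,           # initializers are assumptions here
    stride=1,
    padding=1,
    kernel_initializer=lambda w: nn.init.kaiming_uniform_(w, a=math.sqrt(5)),
    bias_initializer=lambda b: nn.init.zeros_(b),
)

x = torch.rand(1, 2, 8, 8, 8, 8)   # (batch, channels, four spatial dims)
out = conv4d(x)
print(torch.isnan(out).any())      # expected: tensor(False)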
GPU Ubuntu: (output screenshot)
CPU Ubuntu: (output screenshot)
The problem is the default initialization, which I could fix for convNd: https://github.com/pvjosue/pytorch_convNd/commit/2b0e4dbd658f25a18efd8cd848f644c0e9eef29a
I'll leave the issue open, as a solution for the transposed conv is still missing. Thanks for letting me know :)
For me, the culprit was the bias. In the initialization of convNd, the bias is created as
if use_bias:
    self.bias = nn.Parameter(torch.Tensor(out_channels))
else:
    self.register_parameter('bias', None)
According to my understanding, one should not use torch.Tensor to create a tensor, because it allocates memory without initializing it, so the values are arbitrary and can include NaNs. A simple example:
>>> import torch
>>> import torch.nn as nn
>>> bias = nn.Parameter(torch.Tensor(10))
>>> type(bias)
<class 'torch.nn.parameter.Parameter'>
>>> bias.data
tensor([-1.0836e+10, 2.6007e-36, 6.4800e+24, 4.5593e-41, 4.5549e+24,
        4.5593e-41, 1.8760e-16, nan, 6.4629e+24, 4.5593e-41])
After switching it to
if use_bias:
    self.bias = nn.Parameter(torch.zeros(out_channels))
else:
    self.register_parameter('bias', None)
everything works as expected. It is probably not an optimal initialization, but if I understand correctly, a user can pass a suitable one via the bias_initializer keyword argument.
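If zeros feel too crude, PyTorch's built-in Conv layers draw the bias from U(-1/sqrt(fan_in), 1/sqrt(fan_in)); a sketch of a helper that could be passed as bias_initializer (the helper name is hypothetical, and it assumes a scalar kernel_size):

import math
import torch.nn as nn

def make_bias_initializer(in_channels, kernel_size, num_dims):
    # Same rule as PyTorch's built-in Conv layers:
    # bias ~ U(-1/sqrt(fan_in), 1/sqrt(fan_in)), fan_in = in_channels * kernel_size**num_dims
    fan_in = in_channels * kernel_size ** num_dims
    bound = 1.0 / math.sqrt(fan_in)
    return lambda b: nn.init.uniform_(b, -bound, bound)

# e.g. bias_initializer=make_bias_initializer(in_channels=2, kernel_size=3, num_dims=4)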
I added a (very simple) fix. Please note it might not be optimal for everyone, but it suffices in my network. We can discuss this further in the merge request.