Image-Adaptive-3DLUT
Tensor device problems in code
In some situations, the training code raises a device error:

cuda error: an illegal memory access was encountered

After debugging, I found that the main reason is that the tensors used during training are not all on the same device. For example: in Generator3DLUT_identity and Generator3DLUT_zero, self.LUT.device is cpu. In TrilinearInterpolationFunction, int_package and float_package are also on cpu. However, the input and the output of the network are CUDA tensors, so a device error sometimes occurs when running the model.
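The mechanism can be reproduced without a GPU: tensors stored as plain attributes on a module are ignored by module.to(...), while registered parameters (and buffers) are converted along with the module. The module below is a hypothetical sketch, not the repository's code; it uses a dtype conversion as a CPU-friendly stand-in for moving to cuda.

```python
import torch
import torch.nn as nn

class Sketch(nn.Module):
    def __init__(self):
        super().__init__()
        # Registered parameter: follows module.to(...)
        self.lut = nn.Parameter(torch.zeros(3, 3))
        # Plain attribute: left untouched by module.to(...)
        self.package = torch.zeros(3, 3)

net = Sketch().to(torch.float64)  # stand-in for .to("cuda")
print(net.lut.dtype)      # converted to torch.float64
print(net.package.dtype)  # still torch.float32
```

The same thing happens with devices: after net.cuda(), a plain-attribute tensor (or one created with torch.FloatTensor inside forward) stays on the CPU, which is exactly the mismatch described above.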
To solve the problem, it is better to initialize all tensors with a device and dtype derived from the input, instead of creating them on a fixed device. A good way to init a tensor is:

tensor = torch.FloatTensor(...).type_as(input)

or

tensor = torch.FloatTensor(...).to(input.device)

which guarantees all tensors are on the same device as the input.
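A minimal sketch of that pattern inside a forward pass (the module name and weights here are illustrative, not the repository's actual TrilinearInterpolationFunction code):

```python
import torch
import torch.nn as nn

class Interp(nn.Module):
    def forward(self, x):
        # Instead of torch.FloatTensor([...]) -- which is always created
        # on the CPU -- derive device and dtype from the input, so the
        # same code runs on both cpu and cuda inputs:
        weights = torch.tensor([0.5, 0.5]).type_as(x)
        return x * weights.sum()

x = torch.rand(4, 3)           # on cuda this would be torch.rand(4, 3).cuda()
y = Interp()(x)
print(y.device, y.dtype)       # matches x exactly
```

With .type_as(x) (or .to(x.device)), moving the whole model with model.cuda() is enough; no helper tensor is left stranded on the CPU.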
Could you explain more? I met this problem, but I don't know how to fix it. Do you mean that we should put the LUT and TrilinearInterpolationFunction tensors on the GPU?