cutlass
[QST] CUDA free failed when executing example 59_ampere_gather_scatter_conv
What is your question?
Hi, I am modifying the predefined variables of 59_ampere_gather_scatter_conv. These are my new values:
using D = _1;
using H = _4;
using W = _4;
using T = _1;
using R = _1;
using S = _1;
using Z = _1;
using P = _2;
using Q = _2;
using C = _32;
using K = _32;
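As a sanity check on these values, here is a minimal sketch of the usual output-extent formula for a convolution. It assumes unit stride, no padding and no dilation; I have not confirmed that the example requires this relation, and `out_extent` is a hypothetical helper of mine, not part of CUTLASS.

```cpp
// Output extent along one dimension of a convolution, ASSUMING unit stride,
// no padding and no dilation (my assumption, not confirmed for this example).
// out_extent is a hypothetical helper, not a CUTLASS function.
constexpr int out_extent(int in, int filt) { return in - filt + 1; }

// With the values above:
static_assert(out_extent(1, 1) == 1, "Z from D=1, T=1");
static_assert(out_extent(4, 1) == 4, "P from H=4, R=1");
static_assert(out_extent(4, 1) == 4, "Q from W=4, S=1");
```

Under that assumption the formula gives P = Q = 4 for H = W = 4 and R = S = 1, whereas I set P = Q = 2, so my values may not be mutually consistent.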
And here is the result of running with the default arguments (only --no-check is passed):
izcagal@cmts10:~/cutlass/build$ ./examples/59_ampere_gather_scatter_conv/59_ampere_gather_scatter_conv --no-check
Ampere convolution forward propogation kernel supporting both affine and gather/scatter tensors.
Filter layout ( K, (C,T,R,S)) = (_32,(_32,_1,_1,_1)):(_32,(_1,_0,_0,_0))
Activation layout ((N,D,H,W), (C,1,1,1)) = ((4320,_1,_4,_4),(_32,_1,_1,_1)):((_512,_512,_128,_32),(_1,_0,_0,_0))
Output layout ( K, (N,Z,P,Q)) = (_32,(4320,_4,_2,_2)):(_1,(_512,_128,_64,_32))
Allocating tensors ... done.
Initializing data ... done.
Running dense fprop kernel
xformed act layout ((N,Z,P,Q), (C,T,R,S)) = ((4320,_4,_2,_2),(_32,_1,_1,_1)):((_512,_512,_128,_32),(_1,_512,_128,_32))
CUDA error at (/mnt/beegfs/gap/izcagal/cutlass/examples/59_ampere_gather_scatter_conv/ampere_gather_scatter_conv.cu,155)
700 -- an illegal memory access was encountered
Conv TFLOP count = 0.000142
Conv dense perf: 0.000000ms | TFLOP/s = inf
terminate called after throwing an instance of 'thrust::system::system_error'
what(): CUDA free failed: cudaErrorIllegalAddress: an illegal memory access was encountered
Aborted (core dumped)
However, when I run with --n=128, it does not fail, and I would like to know why. I assume there are restrictions on these variables, but the only one I found in the convolution implementation was that C or K must be a multiple of 32.
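One restriction that can be checked by hand is whether the "xformed act layout" printed in the log stays inside the activation tensor. The sketch below flattens the shapes and strides copied from the log and compares the largest reachable offset against the number of activation elements. The helper names (`max_offset`, `overruns`) are hypothetical, and treating the activation layout's extent as the allocation size is my assumption.

```cpp
#include <array>
#include <cstddef>

// Largest linear offset a flattened (shape, stride) layout can reach:
// the sum over all modes of (extent - 1) * stride.
template <std::size_t N>
long long max_offset(const std::array<long long, N>& shape,
                     const std::array<long long, N>& stride) {
    long long off = 0;
    for (std::size_t i = 0; i < N; ++i) off += (shape[i] - 1) * stride[i];
    return off;
}

// True if a view over a buffer of alloc_elems elements can index past its end.
template <std::size_t N>
bool overruns(const std::array<long long, N>& shape,
              const std::array<long long, N>& stride,
              long long alloc_elems) {
    return max_offset(shape, stride) >= alloc_elems;
}

// Activation layout from the log: ((N,D,H,W),(C,1,1,1)).
const std::array<long long, 8> act_shape  = {4320, 1, 4, 4, 32, 1, 1, 1};
const std::array<long long, 8> act_stride = {512, 512, 128, 32, 1, 0, 0, 0};

// Transformed activation layout from the log: ((N,Z,P,Q),(C,T,R,S)).
const std::array<long long, 8> xact_shape  = {4320, 4, 2, 2, 32, 1, 1, 1};
const std::array<long long, 8> xact_stride = {512, 512, 128, 32, 1, 512, 128, 32};

// Assumption: the activation buffer holds exactly max_offset + 1 elements.
const long long act_elems = max_offset(act_shape, act_stride) + 1;
```

With these numbers `act_elems` is 2211840, while the transformed layout reaches offset 2213055, so `overruns(xact_shape, xact_stride, act_elems)` is true. The Z mode has extent 4 and stride 512 in the printed layout even though D has extent 1. If the N extent is simply replaced by 128, the same arithmetic still gives an overrun, so the --n=128 run may be reading out of bounds without faulting. Running the binary under compute-sanitizer should show whether that is the case.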
Any help would be appreciated.
Thanks.