Printing attack adversaries in verbose mode, for better debugging

Open stefan-matcovici opened this issue 5 years ago • 4 comments

Full stacktrace:

/pytorch/aten/src/THC/THCTensorScatterGather.cu:188: void THCudaTensor_scatterFillKernel(TensorInfo<Real, IndexType>, TensorInfo<long, IndexType>, Real, int, IndexType) [with IndexType = unsigned int, Real = long, Dims = 2]: block: [0,0,0], thread: [64,0,0] Assertion `indexValue >= 0 && indexValue < tensor.sizes[dim]` failed.
/pytorch/aten/src/THC/THCTensorScatterGather.cu:188: void THCudaTensor_scatterFillKernel(TensorInfo<Real, IndexType>, TensorInfo<long, IndexType>, Real, int, IndexType) [with IndexType = unsigned int, Real = long, Dims = 2]: block: [0,0,0], thread: [65,0,0] Assertion `indexValue >= 0 && indexValue < tensor.sizes[dim]` failed.
/pytorch/aten/src/THC/THCTensorScatterGather.cu:188: void THCudaTensor_scatterFillKernel(TensorInfo<Real, IndexType>, TensorInfo<long, IndexType>, Real, int, IndexType) [with IndexType = unsigned int, Real = long, Dims = 2]: block: [0,0,0], thread: [66,0,0] Assertion `indexValue >= 0 && indexValue < tensor.sizes[dim]` failed.
/pytorch/aten/src/THC/THCTensorScatterGather.cu:188: void THCudaTensor_scatterFillKernel(TensorInfo<Real, IndexType>, TensorInfo<long, IndexType>, Real, int, IndexType) [with IndexType = unsigned int, Real = long, Dims = 2]: block: [0,0,0], thread: [67,0,0] Assertion `indexValue >= 0 && indexValue < tensor.sizes[dim]` failed.
/pytorch/aten/src/THC/THCTensorScatterGather.cu:188: void THCudaTensor_scatterFillKernel(TensorInfo<Real, IndexType>, TensorInfo<long, IndexType>, Real, int, IndexType) [with IndexType = unsigned int, Real = long, Dims = 2]: block: [0,0,0], thread: [68,0,0] Assertion `indexValue >= 0 && indexValue < tensor.sizes[dim]` failed.
............................
/pytorch/aten/src/THC/THCTensorScatterGather.cu:188: void THCudaTensor_scatterFillKernel(TensorInfo<Real, IndexType>, TensorInfo<long, IndexType>, Real, int, IndexType) [with IndexType = unsigned int, Real = long, Dims = 2]: block: [0,0,0], thread: [30,0,0] Assertion `indexValue >= 0 && indexValue < tensor.sizes[dim]` failed.
/pytorch/aten/src/THC/THCTensorScatterGather.cu:188: void THCudaTensor_scatterFillKernel(TensorInfo<Real, IndexType>, TensorInfo<long, IndexType>, Real, int, IndexType) [with IndexType = unsigned int, Real = long, Dims = 2]: block: [0,0,0], thread: [31,0,0] Assertion `indexValue >= 0 && indexValue < tensor.sizes[dim]` failed.
Traceback (most recent call last):
  File "adversarial_test_model.py", line 56, in <module>
    advdata = adversary.perturb(clndata, target)
  File "/nfshomes/smatcovi/works/pilot-project/ppenv/lib/python3.7/site-packages/advertorch/attacks/carlini_wagner.py", line 209, in perturb
    final_l2distsqs = torch.FloatTensor(final_l2distsqs).to(x.device)
RuntimeError: CUDA error: device-side assert triggered

I am trying to use the CarliniWagnerL2Attack. I am using the code from tutorial_train_mnist.py, but to check robust accuracy. When I run locally (GTX 1050), everything works. When I try to run on an NVIDIA K20 or M40, I get this error.

stefan-matcovici avatar Jul 13 '19 18:07 stefan-matcovici

After running with CUDA_LAUNCH_BLOCKING=1, I got a more specific stack trace:

Traceback (most recent call last):
  File "adversarial_test_model.py", line 56, in <module>
    advdata = adversary.perturb(clndata, target)
  File "/nfshomes/smatcovi/works/pilot-project/ppenv/lib/python3.7/site-packages/advertorch/attacks/carlini_wagner.py", line 207, in perturb
    y_onehot = to_one_hot(y, self.num_classes).float()
  File "/nfshomes/smatcovi/works/pilot-project/ppenv/lib/python3.7/site-packages/advertorch/utils.py", line 83, in to_one_hot
    y_one_hot = y.new_zeros((y.size()[0], num_classes)).scatter_(1, y, 1)

stefan-matcovici avatar Jul 13 '19 18:07 stefan-matcovici
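
A note on the two traces above: CUDA kernels are launched asynchronously, so without CUDA_LAUNCH_BLOCKING=1 the Python frame in the traceback (line 209) is only where the error surfaced, not where the failing kernel was launched. With blocking launches the trace points at the scatter_ call in to_one_hot instead. Below is a minimal sketch of setting the variable from inside the script. It assumes the variable has to be in place before CUDA is initialized, so it is set before torch is imported. The shell form in the comment is the equivalent one-off invocation.

```python
import os

# Equivalent one-off shell form:
#   CUDA_LAUNCH_BLOCKING=1 python adversarial_test_model.py
#
# Assumption: the variable is read when CUDA is initialized, so set it
# before the first CUDA call (safest: before importing torch at all).
os.environ["CUDA_LAUNCH_BLOCKING"] = "1"

# import torch  # import only after the variable is set
```

Blocking launches slow everything down, so this is best kept for debugging runs only.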

I did some searching and found the following:
https://github.com/fastai/fastai/issues/440
https://forums.fast.ai/t/pytorch-not-working-with-an-old-nvidia-card/14632/17
https://discuss.pytorch.org/t/pytorch-no-longer-supports-this-gpu-because-it-is-too-old/13803
https://en.wikipedia.org/wiki/Nvidia_Tesla

It seems that precompiled PyTorch does not work on old NVIDIA GPUs, and compiling from source might make it work. But I don't have an old GPU on hand, so I cannot run this test myself.

gwding avatar Jul 13 '19 23:07 gwding

I might have sent you on the wrong track. Sorry about that. I was calling the attack with True as the argument for num_classes. Maybe this is an issue after all, because the mistake wasn't caught until the GPU computations. Also, as a future feature, I think it would be a great idea to add some logging about the attack, such as the number of iterations, the current loss, etc.

That being said, you may close the issue. Sorry once more.

stefan-matcovici avatar Jul 14 '19 03:07 stefan-matcovici
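
For context on why True slipped through: in Python, bool is a subclass of int, so True behaves like 1. Presumably the one-hot buffer in to_one_hot was therefore allocated with a single column, and every label of 1 or more became an out-of-range index for scatter_, which matches the `indexValue < tensor.sizes[dim]` assertion in the trace. A minimal sketch of a CPU-side check that would catch this before any GPU work is shown below. The helper name check_num_classes is hypothetical and not part of advertorch.

```python
import numbers


def check_num_classes(num_classes, labels):
    """Hypothetical helper (not part of advertorch): validate on the CPU first."""
    # bool is a subclass of int, so True would otherwise be accepted as 1.
    if isinstance(num_classes, bool) or not isinstance(num_classes, numbers.Integral):
        raise TypeError("num_classes must be an integer, got %r" % (num_classes,))
    if num_classes < 1:
        raise ValueError("num_classes must be >= 1, got %d" % num_classes)
    # Every label must be a valid column index into the one-hot matrix.
    out_of_range = [int(y) for y in labels if not 0 <= int(y) < num_classes]
    if out_of_range:
        raise ValueError(
            "labels %r fall outside [0, %d)" % (out_of_range, num_classes)
        )
    return int(num_classes)
```

It could be called as check_num_classes(10, target.tolist()) before constructing the attack. Passing True raises a TypeError immediately instead of a device-side assert later.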

@stefan-matcovici Thanks for the suggestion. I changed the title of this issue, and labeled it as a todo item.

gwding avatar Jul 15 '19 18:07 gwding
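
As an illustration of the logging suggested in this thread (iteration count and current loss), here is a minimal sketch using Python's standard logging module. It is not advertorch code: run_with_logging, step_fn, log_every, and the logger name are all hypothetical, and step_fn stands in for one optimization step of an attack that returns the current loss.

```python
import logging

logger = logging.getLogger("attack_progress")  # hypothetical logger name


def run_with_logging(step_fn, max_iterations, log_every=10):
    """Call step_fn(i) once per iteration and log the loss it returns."""
    history = []
    for i in range(max_iterations):
        loss = float(step_fn(i))  # step_fn is a stand-in for one attack step
        history.append(loss)
        # Log every `log_every` iterations and always on the last one.
        if i % log_every == 0 or i == max_iterations - 1:
            logger.info("iteration %d/%d, loss=%.6f", i + 1, max_iterations, loss)
    return history
```

Nothing is printed unless the caller opts in, for example with logging.basicConfig(level=logging.INFO), which keeps the default behavior quiet while leaving a verbose mode available for debugging.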