Segmentation fault (core dumped)

Open rongleiji opened this issue 5 years ago • 8 comments

Hi Xinyi, thanks for your great work! I get a segmentation fault when I try to train the D3D model. DCN was installed as follows:

Installed /scratch/users/rji19/.conda/envs/pytorch1.2.0/lib/python3.7/site-packages/D3D-1.0-py3.7-linux-x86_64.egg
Processing dependencies for D3D==1.0
Finished processing dependencies for D3D==1.0

The DCN installation looks fine to me, but during training the process never gets past the call output = DCN.deform_conv_forward(). So I think there must be something wrong in DCN that I'm not aware of. Could you give me some hints about this problem? Thank you.

rongleiji avatar Jul 18 '20 16:07 rongleiji

I have the same question. Have you solved it?

CNHNLP avatar Jan 07 '21 11:01 CNHNLP

Not yet

rongleiji avatar Jan 07 '21 12:01 rongleiji

I think it may be the CUDA version. It fails for me with CUDA 10.0.

CNHNLP avatar Jan 07 '21 12:01 CNHNLP

I don't know. I tried different CUDA versions.

rongleiji avatar Jan 07 '21 12:01 rongleiji

Same problem here.

  File "/home/sysgen/anaconda3/envs/pytorch/lib/python3.7/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/sysgen/gitlab/nnUNet/nnunet/network_architecture/deformable_UNet.py", line 420, in forward
    x_ = torch.cat((x_, self.conv_blocks_context[d][1](splited_2)), dim = 1 )
  File "/home/sysgen/anaconda3/envs/pytorch/lib/python3.7/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/sysgen/gitlab/nnUNet/nnunet/network_architecture/custom_modules/deformconv3d.py", line 174, in forward
    out = self.dcn1(x)
  File "/home/sysgen/anaconda3/envs/pytorch/lib/python3.7/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/sysgen/gitlab/nnUNet/nnunet/network_architecture/custom_modules/deformconv3d.py", line 146, in forward
    self.im2col_step)
  File "/home/sysgen/gitlab/nnUNet/nnunet/network_architecture/custom_modules/deformconv3d.py", line 35, in forward
    ctx.im2col_step)
RuntimeError: input.is_contiguous() INTERNAL ASSERT FAILED at "/home/sysgen/gitlab/D3Dnet/code/dcn/src/cuda/deform_conv_cuda.cu":41, please report a bug to PyTorch. input tensor has to be contiguous
Segmentation fault (core dumped)

BBQtime avatar Jan 25 '21 22:01 BBQtime

Have you tried reducing im2col_step? batch_size must be divisible by im2col_step.
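A minimal sketch of that divisibility rule, assuming you pick im2col_step yourself before building the layer. The helper name pick_im2col_step and the default cap of 64 are hypothetical and not taken from the D3Dnet code; the only constraint taken from the thread is batch_size % im2col_step == 0.

```python
def pick_im2col_step(batch_size: int, max_step: int = 64) -> int:
    """Return the largest step <= max_step that divides batch_size evenly.

    `max_step` is an assumed upper bound, not a value from D3Dnet.
    """
    if batch_size < 1 or max_step < 1:
        raise ValueError("batch_size and max_step must be positive")
    # Walk downwards so the first divisor found is the largest one.
    for step in range(min(batch_size, max_step), 0, -1):
        if batch_size % step == 0:
            return step
    return 1  # unreachable: 1 divides every positive batch_size
```

For example, pick_im2col_step(6, 4) gives 3, since 4 does not divide 6. Keep in mind that the last batch of an epoch can be smaller than the configured batch size, so the check has to hold for that batch as well.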

BBQtime avatar Jan 25 '21 23:01 BBQtime

deform_conv_cuda.cu checks whether the input tensor is stored contiguously in memory. Did you try calling tensor.contiguous() (https://pytorch.org/docs/stable/tensors.html?highlight=contiguous#torch.Tensor.contiguous) before calling the DCN module?
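As a rough sketch of that check, the two helpers below rely only on Tensor.is_contiguous() and Tensor.contiguous(), both of which exist in PyTorch. The helper names are hypothetical, and which tensors you pass in (input, offset, and so on) depends on your own module.

```python
def non_contiguous_names(**tensors):
    """Return the keyword names of all tensors that are not contiguous."""
    return [name for name, t in tensors.items() if not t.is_contiguous()]


def make_contiguous(*tensors):
    """Return contiguous versions of the given tensors, in the same order.

    Tensor.contiguous() returns the tensor itself when it is already
    contiguous, so no copy is made in that case.
    """
    return tuple(t.contiguous() for t in tensors)
```

Calling non_contiguous_names(x=x) just before the DCN layer shows whether x is the culprit, and (x,) = make_contiguous(x) fixes it at the cost of a copy. Views produced by operations such as permute, transpose, or splitting along the channel dimension can be non-contiguous.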

ChristophReich1996 avatar Feb 09 '21 02:02 ChristophReich1996

Yes, it is necessary

rongleiji avatar Feb 09 '21 11:02 rongleiji