torchsparse

conv3d with empty kernel_map

Open cslxiao opened this issue 4 years ago • 12 comments

Applying a transposed convolution with conv3d sometimes raises an error because kernel_map is empty. The reason is that sparseconv_op is called without checking whether kernel_map is empty.

cslxiao avatar Dec 31 '20 06:12 cslxiao

Thanks for pointing this out! I have created a PR for this issue.

zhijian-liu avatar Mar 08 '21 23:03 zhijian-liu

@cslxiao, could you please install the latest version to see whether the issue has been resolved.

zhijian-liu avatar Mar 09 '21 07:03 zhijian-liu

@zhijian-liu, good job! But I think the bug is still there. In the code:

        # do upsample
        original_stride = int(cur_stride / stride)
        kernel_map = inputs.kernel_maps.get(
            'k%s_os%d_s%d_d%d' % (ks, original_stride, stride, dilation), None)
        output_features = sparseconv_op(features, kernel, kernel_map[0],
                                        kernel_map[1], kernel_map[2],
                                        transpose)
 

The error occurs when sparseconv_op is called with kernel_map = None. The kernel_maps of the input tensor do not necessarily contain the kernel_map required at this step when the network is not center-symmetric with regard to its kernel_size, stride and dilation settings.
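To make the failure mode concrete, here is a minimal sketch of a guarded lookup. It reuses the key format from the snippet above, but the function name get_kernel_map and the plain-dict cache are assumptions for illustration only, not the library's actual API or fix. It raises a descriptive error instead of indexing into None.

```python
def get_kernel_map(kernel_maps, ks, cur_stride, stride, dilation):
    """Look up the cached kernel map needed by a transposed convolution.

    Hypothetical helper: `kernel_maps` is assumed to be a plain dict that
    maps key strings to (map0, map1, map2) tuples.
    """
    # Same key construction as in the snippet quoted in this thread.
    original_stride = int(cur_stride / stride)
    key = 'k%s_os%d_s%d_d%d' % (ks, original_stride, stride, dilation)

    kernel_map = kernel_maps.get(key, None)
    if kernel_map is None:
        # Fail early with the missing key instead of a TypeError on None[0].
        raise KeyError(
            'no cached kernel map for %s; available keys: %s'
            % (key, sorted(kernel_maps)))
    return kernel_map
```

With a cache that only holds 'k2_os1_s2_d1', a lookup with dilation=1 succeeds, while the same lookup with dilation=2 raises a KeyError naming the missing key.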

cslxiao avatar Mar 11 '21 02:03 cslxiao

Hi @cslxiao, thanks for the information. Could you please provide me with a minimal example to reproduce this? Thanks!

zhijian-liu avatar Mar 11 '21 04:03 zhijian-liu

Hi @zhijian-liu, take the modified MinkUNet in the e3d repo as an example:


class DilatedMinkUNet(MinkUNet):
    def __init__(self, **kwargs):
        super().__init__(**kwargs)

        cr = kwargs.get('cr', 1.0)
        dilation = kwargs.get('dilation', [1, 1, 1, 2, 2, 1, 1, 1])
        cs = [32, 32, 64, 128, 256, 256, 128, 96, 96]
        cs = [int(cr * x) for x in cs]
        self.cs = cs
        self.run_up = kwargs.get('run_up', True)
        
        self.stem = nn.Sequential(
            spnn.Conv3d(4, cs[0], kernel_size=3, stride=1),
            spnn.BatchNorm(cs[0]), spnn.ReLU(True),
            spnn.Conv3d(cs[0], cs[0], kernel_size=3, stride=1),
            spnn.BatchNorm(cs[0]), spnn.ReLU(True))

        self.stage1 = nn.Sequential(
            BasicConvolutionBlock(cs[0], cs[0], ks=2, stride=2, dilation=dilation[0]),
            ResidualBlock(cs[0], cs[1], ks=3, stride=1, dilation=dilation[0]),
            ResidualBlock(cs[1], cs[1], ks=3, stride=1, dilation=dilation[0]),
        )

        self.stage2 = nn.Sequential(
            BasicConvolutionBlock(cs[1], cs[1], ks=2, stride=2, dilation=dilation[1]),
            ResidualBlock(cs[1], cs[2], ks=3, stride=1, dilation=dilation[1]),
            ResidualBlock(cs[2], cs[2], ks=3, stride=1, dilation=dilation[1]),
        )

        self.stage3 = nn.Sequential(
            BasicConvolutionBlock(cs[2], cs[2], ks=2, stride=2, dilation=dilation[2]),
            ResidualBlock(cs[2], cs[3], ks=3, stride=1, dilation=dilation[2]),
            ResidualBlock(cs[3], cs[3], ks=3, stride=1, dilation=dilation[2]),
        )

        self.stage4 = nn.Sequential(
            BasicConvolutionBlock(cs[3], cs[3], ks=2, stride=2, dilation=dilation[3]),
            ResidualBlock(cs[3], cs[4], ks=3, stride=1, dilation=dilation[3]),
            ResidualBlock(cs[4], cs[4], ks=3, stride=1, dilation=dilation[3]),
        )

        self.up1 = nn.ModuleList([
            BasicDeconvolutionBlock(cs[4], cs[5], ks=2, stride=2, dilation=dilation[4]),
            nn.Sequential(
                ResidualBlock(cs[5] + cs[3], cs[5], ks=3, stride=1, dilation=dilation[4]),
                ResidualBlock(cs[5], cs[5], ks=3, stride=1, dilation=dilation[4]),
            )
        ])

        self.up2 = nn.ModuleList([
            BasicDeconvolutionBlock(cs[5], cs[6], ks=2, stride=2, dilation=dilation[5]),
            nn.Sequential(
                ResidualBlock(cs[6] + cs[2], cs[6], ks=3, stride=1, dilation=dilation[5]),
                ResidualBlock(cs[6], cs[6], ks=3, stride=1, dilation=dilation[5]),
            )
        ])

        self.up3 = nn.ModuleList([
            BasicDeconvolutionBlock(cs[6], cs[7], ks=2, stride=2, dilation=dilation[6]),
            nn.Sequential(
                ResidualBlock(cs[7] + cs[1], cs[7], ks=3, stride=1, dilation=dilation[6]),
                ResidualBlock(cs[7], cs[7], ks=3, stride=1, dilation=dilation[6]),
            )
        ])

        self.up4 = nn.ModuleList([
            BasicDeconvolutionBlock(cs[7], cs[8], ks=2, stride=2, dilation=dilation[7]),
            nn.Sequential(
                ResidualBlock(cs[8] + cs[0], cs[8], ks=3, stride=1, dilation=dilation[7]),
                ResidualBlock(cs[8], cs[8], ks=3, stride=1, dilation=dilation[7]),
            )
        ])

Symmetric dilation parameters such as dilation: [1, 1, 1, 2, 2, 1, 1, 1] or dilation: [1, 2, 2, 2, 2, 2, 2, 1] are fine. But asymmetric dilation parameters, such as dilation: [1, 1, 1, 1, 1, 1, 1, 2], cause the aforementioned error.
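The difference between the symmetric and asymmetric settings can be checked with a small pure-Python model of the key bookkeeping. This is only a sketch under stated assumptions: it assumes each strided downsampling convolution caches its kernel map under the key format quoted earlier, using the tensor stride before the convolution as the os field, and it only tracks the four ks=2, stride=2 down blocks and the four deconvolution blocks of the network above. The function names are hypothetical.

```python
def kernel_map_key(ks, original_stride, stride, dilation):
    # Key format taken from the snippet quoted earlier in this thread.
    return 'k%s_os%d_s%d_d%d' % (ks, original_stride, stride, dilation)


def missing_upsample_keys(dilation, ks=2, stride=2):
    """Return the keys the four deconvolutions would fail to find.

    `dilation` has 8 entries: the first 4 belong to the downsampling
    stages, the last 4 to the upsampling stages (up1..up4).
    """
    assert len(dilation) == 8

    cached = set()
    cur_stride = 1
    # Downsampling path: assumed to cache one kernel map per strided conv.
    for d in dilation[:4]:
        cached.add(kernel_map_key(ks, cur_stride, stride, d))
        cur_stride *= stride

    missing = []
    # Upsampling path: each deconvolution looks up the map of the
    # downsampling conv it is supposed to invert.
    for d in dilation[4:]:
        original_stride = cur_stride // stride
        key = kernel_map_key(ks, original_stride, stride, d)
        if key not in cached:
            missing.append(key)
        cur_stride = original_stride
    return missing


print(missing_upsample_keys([1, 1, 1, 2, 2, 1, 1, 1]))  # []
print(missing_upsample_keys([1, 2, 2, 2, 2, 2, 2, 1]))  # []
print(missing_upsample_keys([1, 1, 1, 1, 1, 1, 1, 2]))  # ['k2_os1_s2_d2']
```

Under this model, up4 with dilation 2 asks for a map that stage1 (dilation 1) never created, which matches the behaviour described above.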

cslxiao avatar Mar 11 '21 13:03 cslxiao

Same question here. Do you plan to support a direct inverse conv3d?

CurryYuan avatar Apr 17 '21 03:04 CurryYuan

I think what you need here is a generative sparse deconvolution. This is a bit different from what we have implemented now. We will investigate this in more detail.

zhijian-liu avatar Apr 26 '21 19:04 zhijian-liu

I am still having this issue with v1.4.0

import torch

import torchsparse
import torchsparse.nn

feat_depth = 64

coords = torch.zeros((0, 4), dtype=torch.int32, device='cuda')
feats = torch.zeros((0, feat_depth), dtype=torch.float32, device='cuda')
t = torchsparse.SparseTensor(feats, coords)

conv = torchsparse.nn.Conv3d(feat_depth, feat_depth, kernel_size=3, bias=False)

conv(t)

RuntimeError: CUDA error: invalid configuration argument

noahstier avatar Jul 01 '21 21:07 noahstier

I think I got the same issue here. Is this related to generalized Sparse Tensor?

/tmp/ipykernel_27426/3796938854.py in forward(self, x)
    198         x = to_sparse(x)
    199 
--> 200         x0 = self.stem(x)
    201         x1 = self.stage1(x0)
    202         x2 = self.stage2(x1)

~/.conda/envs/torchsparse/lib/python3.8/site-packages/torch/nn/modules/module.py in _call_impl(self, *input, **kwargs)
   1049         if not (self._backward_hooks or self._forward_hooks or self._forward_pre_hooks or _global_backward_hooks
   1050                 or _global_forward_hooks or _global_forward_pre_hooks):
-> 1051             return forward_call(*input, **kwargs)
   1052         # Do not call functions when jit is used
   1053         full_backward_hooks, non_full_backward_hooks = [], []

~/.conda/envs/torchsparse/lib/python3.8/site-packages/torch/nn/modules/container.py in forward(self, input)
    137     def forward(self, input):
    138         for module in self:
--> 139             input = module(input)
    140         return input
    141 

~/.conda/envs/torchsparse/lib/python3.8/site-packages/torch/nn/modules/module.py in _call_impl(self, *input, **kwargs)
   1049         if not (self._backward_hooks or self._forward_hooks or self._forward_pre_hooks or _global_backward_hooks
   1050                 or _global_forward_hooks or _global_forward_pre_hooks):
-> 1051             return forward_call(*input, **kwargs)
   1052         # Do not call functions when jit is used
   1053         full_backward_hooks, non_full_backward_hooks = [], []

~/.conda/envs/torchsparse/lib/python3.8/site-packages/torchsparse-1.4.0-py3.8-linux-x86_64.egg/torchsparse/nn/modules/conv.py in forward(self, input)
     64 
     65     def forward(self, input: SparseTensor) -> SparseTensor:
---> 66         return F.conv3d(input,
     67                         self.kernel,
     68                         kernel_size=self.kernel_size,

~/.conda/envs/torchsparse/lib/python3.8/site-packages/torchsparse-1.4.0-py3.8-linux-x86_64.egg/torchsparse/nn/functional/conv.py in conv3d(input, weight, kernel_size, bias, stride, dilation, transposed)
    121                                         input.stride)
    122             queries = F.sphash(coords, offsets)
--> 123             results = F.sphashquery(queries, references)
    124 
    125             nbsizes = torch.sum(results != -1, dim=1)

~/.conda/envs/torchsparse/lib/python3.8/site-packages/torchsparse-1.4.0-py3.8-linux-x86_64.egg/torchsparse/nn/functional/query.py in sphashquery(queries, references)
     19 
     20     if queries.device.type == 'cuda':
---> 21         output = torchsparse.backend.hash_query_cuda(queries, references,
     22                                                      indices)
     23     elif queries.device.type == 'cpu':

RuntimeError: CUDA error: invalid configuration argument
CUDA kernel errors might be asynchronously reported at some other API call,so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.

resuly avatar Sep 29 '21 20:09 resuly

@resuly, could you please check the shape of x (both coords and feats) before the line of x0 = self.stem(x)?

zhijian-liu avatar Oct 01 '21 22:10 zhijian-liu

> @resuly, could you please check the shape of x (both coords and feats) before the line of x0 = self.stem(x)?

Sorry for the late reply. It should be the same zero inputs issue as @noahstier mentioned above.

resuly avatar Nov 01 '21 04:11 resuly

@resuly, in this case, you should check what the input size is (before sending into the model). You need to make sure that the input has more than one point.
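A check along these lines could be sketched as follows. The helper name has_enough_points and the min_points parameter are assumptions for illustration; the function only inspects shapes, so it works with coords.shape and feats.shape of the tensors used in the reproduction above. The default threshold of 1 only rejects empty inputs (the zero-point case reported above); raise it if a stricter minimum is needed.

```python
def has_enough_points(coords_shape, feats_shape, min_points=1):
    """Return True if the input holds at least `min_points` points.

    Hypothetical helper: both arguments are shape tuples whose first
    entry is the number of points, e.g. (N, 4) and (N, C).
    """
    num_coords = coords_shape[0]
    num_feats = feats_shape[0]
    if num_coords != num_feats:
        raise ValueError(
            'coords and feats disagree on the number of points: %d vs %d'
            % (num_coords, num_feats))
    return num_coords >= min_points
```

For example, a caller could skip the forward pass when has_enough_points(coords.shape, feats.shape) returns False, rather than letting the empty tensor reach the CUDA kernel.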

zhijian-liu avatar Nov 05 '21 21:11 zhijian-liu

Since TorchSparse has been upgraded to v2.1.0, could you please attempt to install the latest version? I will now close this issue, but please don't hesitate to reopen it if the problem persists.

zhijian-liu avatar Jul 15 '23 01:07 zhijian-liu