ContinuousConv on CUDA returns 0.0 (Open3D 0.18)
Checklist
- [X] I have searched for similar issues.
- [X] I have tested with the latest development wheel.
- [X] I have checked the release documentation and the latest documentation (for main branch).
Describe the issue
On 0.17, the CPU and GPU implementations of FixedRadiusSearch return the same result, which makes CConv work as intended.
On 0.18, CPU works but GPU returns empty results. This makes CConv return 0.0 when using CUDA.
Steps to reproduce the bug
import torch
import open3d.ml.torch as t3d
# Tensors are created on the CPU here so that `res` below is a genuine CPU reference.
inp_positions = torch.randn([20,3])
inp_features = torch.randn([20,4])
out_positions = torch.randn([10,3])
conv = t3d.layers.ContinuousConv(
in_channels=4,
filters=4,
kernel_size=[2,2,2],
)
res = conv(inp_features, inp_positions, out_positions, extents=2.0)
res_cuda = conv.cuda()(inp_features.cuda(), inp_positions.cuda(), out_positions.cuda(), extents=2.0)
resmin, resmax = res.min().item(), res.max().item()
rescmin, rescmax = res_cuda.min().item(), res_cuda.max().item()
print(f"{resmin=}\n{resmax=}\n{rescmin=}\n{rescmax=}")
Error message
resmin=<different from 0>
resmax=<different from 0>
rescmin=0.0
rescmax=0.0
Expected behavior
As with Open3D 0.17, FixedRadiusSearch should work properly on CUDA.
Open3D, Python and System information
- Operating system: Ubuntu 22.04
- Python version: 3.10
- Open3D version: 0.18
- System type: x86
- Is this remote workstation?: yes
- How did you install Open3D?: pip (clean conda environment, only Torch & Open3D)
Additional information
No response
More precisely, the issue seems to come from build_spatial_hash_table.
Given the following point cloud and radius:
import torch
import open3d.ml.torch as ml3d
points = torch.Tensor([
[0.1,0.1,0.1],
[0.5,0.5,0.5],
[1.7,1.7,1.7],
[1.8,1.8,1.8],
[0.3,2.4,1.4]])
radius = 1.0
On CPU, building the hash table returns:
table = ml3d.ops.build_spatial_hash_table(points,
radius,
points_row_splits=torch.LongTensor([0,5]),
hash_table_size_factor=1/64)
build_spatial_hash_table(hash_table_index=tensor([0, 1, 2, 3, 4], dtype=torch.int32), hash_table_cell_splits=tensor([0, 5], dtype=torch.int32), hash_table_splits=tensor([0, 1], dtype=torch.int32))
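The CPU output above can be cross-checked against a small pure-Python sketch of a spatial hash table build (a counting sort of point indices by hash cell). This is not Open3D's implementation: the function name `build_hash_table_reference`, the table size rule `max(1, int(factor * n))`, the cell size of `2 * radius`, and the hash constants are all assumptions made for illustration. With 5 points and a factor of 1/64 the table collapses to a single cell, so the result does not depend on the hash function.

```python
import math
from typing import List, Sequence, Tuple


def build_hash_table_reference(
    points: Sequence[Sequence[float]],
    radius: float,
    hash_table_size_factor: float,
) -> Tuple[List[int], List[int]]:
    """Sort point indices into hash table cells (illustrative sketch only)."""
    num_points = len(points)
    # Assumption: the table always has at least one cell.
    table_size = max(1, int(hash_table_size_factor * num_points))
    # Assumption: the voxel edge length is twice the search radius.
    cell_size = 2.0 * radius

    def cell_of(point: Sequence[float]) -> int:
        # Integer voxel coordinates followed by a hypothetical spatial hash.
        x, y, z = (math.floor(c / cell_size) for c in point)
        return ((x * 73856093) ^ (y * 19349663) ^ (z * 83492791)) % table_size

    cells = [cell_of(p) for p in points]

    # Count points per cell, then prefix-sum the counts into cell boundaries.
    counts = [0] * table_size
    for c in cells:
        counts[c] += 1
    cell_splits = [0]
    for n in counts:
        cell_splits.append(cell_splits[-1] + n)

    # Scatter each point index into the next free slot of its cell.
    next_slot = list(cell_splits[:-1])
    index = [0] * num_points
    for i, c in enumerate(cells):
        index[next_slot[c]] = i
        next_slot[c] += 1
    return index, cell_splits


points = [
    [0.1, 0.1, 0.1],
    [0.5, 0.5, 0.5],
    [1.7, 1.7, 1.7],
    [1.8, 1.8, 1.8],
    [0.3, 2.4, 1.4],
]
hash_table_index, hash_table_cell_splits = build_hash_table_reference(
    points, radius=1.0, hash_table_size_factor=1 / 64
)
print(hash_table_index, hash_table_cell_splits)
```

Under these assumptions the sketch yields the same `hash_table_index` and `hash_table_cell_splits` as the CPU result, which suggests the CPU output is the expected one.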
On CUDA, the same call returns:
table = ml3d.ops.build_spatial_hash_table(points.cuda(),
radius,
points_row_splits=torch.LongTensor([0,5]),
hash_table_size_factor=1/64)
build_spatial_hash_table(hash_table_index=tensor([0, 0, 0, 0, 0], device='cuda:0', dtype=torch.int32), hash_table_cell_splits=tensor([0, 0], device='cuda:0', dtype=torch.int32), hash_table_splits=tensor([0, 1], dtype=torch.int32))
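A quick way to see that the CUDA result is malformed, rather than just different, is to check a few structural invariants. The sketch below assumes (based on the CPU output and the documented meaning of the fields, not on the Open3D source) that for a single batch item `hash_table_cell_splits` should start at 0, be non-decreasing and end at the number of points, and that `hash_table_index` should be a permutation of the point indices. The helper `check_hash_table` is hypothetical and works on plain lists, e.g. obtained with `tensor.cpu().tolist()`.

```python
from typing import List


def check_hash_table(
    index: List[int], cell_splits: List[int], num_points: int
) -> List[str]:
    """Return a list of violated invariants (empty if the table looks sane)."""
    problems = []
    if cell_splits[0] != 0:
        problems.append("cell_splits does not start at 0")
    if cell_splits[-1] != num_points:
        problems.append("cell_splits does not end at the number of points")
    if any(a > b for a, b in zip(cell_splits, cell_splits[1:])):
        problems.append("cell_splits is not non-decreasing")
    if sorted(index) != list(range(num_points)):
        problems.append("index is not a permutation of the point indices")
    return problems


# Values copied from the CPU and CUDA outputs reported in this issue.
cpu_problems = check_hash_table([0, 1, 2, 3, 4], [0, 5], num_points=5)
cuda_problems = check_hash_table([0, 0, 0, 0, 0], [0, 0], num_points=5)
print("CPU :", cpu_problems)
print("CUDA:", cuda_problems)
```

With the values reported here, the CPU table passes every check, while the CUDA table fails two of them: the splits end at 0 instead of 5, and the index array is not a permutation.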
The CUDA call sometimes even returns values such as
build_spatial_hash_table(hash_table_index=tensor([1065353216, 1065353216, 1065353216, 1056964608, 1073741824], device='cuda:0', dtype=torch.int32), hash_table_cell_splits=tensor([0, 0], device='cuda:0', dtype=torch.int32), hash_table_splits=tensor([0, 1], dtype=torch.int32))
which would obviously cause overflow issues if used later in the (fixed) radius search for ContinuousConv.
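One observation that may help with debugging: the large integers above are not arbitrary. Reinterpreting their 32 bits as IEEE 754 single-precision floats gives small round numbers, which would be consistent with `hash_table_index` being left unwritten and exposing memory that previously held float data. That is only a hypothesis, not a confirmed diagnosis. The helper name `int32_bits_to_float` is made up for this sketch.

```python
import struct


def int32_bits_to_float(value: int) -> float:
    """Reinterpret the bits of a signed 32-bit integer as a float32."""
    return struct.unpack("<f", struct.pack("<i", value))[0]


# Values copied from the CUDA hash_table_index reported above.
suspicious = [1065353216, 1065353216, 1065353216, 1056964608, 1073741824]
decoded = [int32_bits_to_float(v) for v in suspicious]
print(decoded)  # [1.0, 1.0, 1.0, 0.5, 2.0]
```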