What is Voxel Size and how to choose the correct number?
In the semantic segmentation example indoor.py, line 138, there is a hyperparameter called voxel_size. The original value is 0.02, and I have tried different numbers for it. It looks like voxel_size has a significant effect on model performance. So what is this number, and how do I choose a correct value for it?
Voxel size determines the resolution of the space.
Let's say we have a 100m x 100m x 25m LIDAR scan. Depending on the voxel size we select, the grid resolution becomes:
| voxel size | resolution |
|---|---|
| 1m | 100 x 100 x 25 |
| 50cm | 200 x 200 x 50 |
| 5cm | 2000 x 2000 x 500 |
The network can see more detail if you use a small voxel size, but it will be correspondingly slower. This is the same as with the 2D CNNs you are familiar with: high-resolution images require more computation.
Similarly, you can't expect a 2D CNN trained on 100x100 images to work well on 10x10 or 1000x1000 images at test time. You have to train with the same resolution / voxel size that you will use at test time.
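To make this concrete, here is a minimal sketch of the voxel size / resolution trade-off, assuming `ME.utils.sparse_quantize` with a `quantization_size` argument as used in the MinkowskiEngine examples (the point cloud is synthetic):

```python
import numpy as np
import MinkowskiEngine as ME

# Synthetic stand-in for a LIDAR scan: 100k random points
# spread over a 100m x 100m x 25m volume.
points = np.random.rand(100_000, 3) * np.array([100.0, 100.0, 25.0])

for voxel_size in [1.0, 0.5, 0.05]:
    # sparse_quantize maps each point to its voxel and keeps
    # one coordinate per occupied voxel.
    coords = ME.utils.sparse_quantize(points, quantization_size=voxel_size)
    print(f"voxel_size={voxel_size}: {len(coords)} occupied voxels")
```

With a 1m voxel the grid has only 100 x 100 x 25 cells, so many points share a voxel; at 5cm nearly every point keeps its own voxel, and the network sees correspondingly more detail at a higher compute cost.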
Thanks for your explanation. I'm trying to understand this a bit more clearly:
```python
import MinkowskiEngine as ME


def create_input_batch(batch, is_minknet, device="cuda", quantization_size=0.05):
    if is_minknet:
        print("pre", batch["coordinates"][:, 1:])
        print("pre", batch["coordinates"][:, 1:].shape)
        # Scale the xyz columns (column 0 is the batch index) so that
        # one unit corresponds to one voxel of size quantization_size.
        batch["coordinates"][:, 1:] = batch["coordinates"][:, 1:] / quantization_size
        print("post", batch["coordinates"][:, 1:])
        print("post", batch["coordinates"][:, 1:].shape)
        return ME.TensorField(
            coordinates=batch["coordinates"],
            features=batch["features"],
            device=device,
        )
    else:
        return batch["coordinates"].permute(0, 2, 1).to(device)
```
Here is the output with quantization_size = 0.5:

```
pre tensor([[ 0.,  0.,  0.],
        [ 0.,  0.,  1.],
        [ 0.,  0.,  2.],
        ...,
        [49., 49., 47.],
        [49., 49., 48.],
        [49., 49., 49.]])
pre torch.Size([1000000, 3])
post tensor([[ 0.,  0.,  0.],
        [ 0.,  0.,  2.],
        [ 0.,  0.,  4.],
        ...,
        [98., 98., 94.],
        [98., 98., 96.],
        [98., 98., 98.]])
post torch.Size([1000000, 3])
```
So voxel_size = 0.5 does not change the number of points in a batch.
But it has scaled the maximum (x, y, z) coordinates from (49, 49, 49) to (98, 98, 98).
It's confusing because before quantization I had 8 x 50x50x50 = 1,000,000 points, and after quantization I still have the same 1,000,000 points.
Does that mean we are now sampling 1,000,000 points from high-resolution 3D data?
The reason you got the same number is that you used TensorField, which is a wrapper for a continuous point cloud and keeps every input point.
You can call .sparse() to convert the TensorField into a SparseTensor, which will show you the number of unique coordinates.
In your run, dividing the integer grid 0-49 by 0.5 produces the even coordinates 0, 2, ..., 98, so no two points land in the same voxel and .sparse() would still report 1,000,000 unique coordinates; a quantization_size larger than 1 would start merging points.
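Here is a minimal sketch of that difference, assuming the TensorField / .sparse() API of MinkowskiEngine 0.5 (the grid size and point count are made up for illustration):

```python
import torch
import MinkowskiEngine as ME

N = 10_000
# Random points on a coarse 20 x 20 x 20 grid, so many points
# fall into the same voxel. Column 0 is the batch index.
xyz = torch.randint(0, 20, (N, 3)).float()
coords = torch.cat([torch.zeros(N, 1), xyz], dim=1)
feats = torch.ones(N, 1)

tfield = ME.TensorField(coordinates=coords, features=feats)
print(tfield.C.shape[0])   # N: a TensorField keeps every input point

stensor = tfield.sparse()  # quantize and deduplicate
print(stensor.C.shape[0])  # fewer than N: one row per unique voxel
```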