Multiple Hashgrids
I am trying to use a different hash grid for each set of spatially separated points, but I am getting strange values back from the wp.hash_grid_point_id function.
The MRE below reproduces the issue.
```python
import warp as wp
import numpy as np

# wp.config.mode = "debug"
wp.init()

n_envs = 2
n_points = 2
offset = np.array([10.0, 0.0, 0.0])
r = 0.1
pos = np.random.rand(n_envs, n_points, 3)

grids = []
for i in range(n_envs):
    grid = wp.HashGrid(20, 20, 20)
    pos[i, ...] += i * offset
    grid.build(wp.array(pos[i, ...], dtype=wp.vec3), radius=r)
    grids.append(grid.id)

p = wp.array2d(pos, dtype=wp.vec3)
g = wp.array(grids, dtype=wp.uint64)
print("grids", g)

@wp.kernel
def hashgrid_tester(
    grids: wp.array(dtype=wp.uint64),
    pos: wp.array2d(dtype=wp.vec3),
):
    for env in range(n_envs):
        for tid in range(n_points):
            grid = grids[env]
            i = wp.hash_grid_point_id(grid, tid)
            if i == -1:
                return
            wp.printf(
                "env %d, tid %d, i %d, grid %lld \n",
                env,
                tid,
                i,
                grid,
            )

wp.launch(hashgrid_tester, dim=[1], inputs=[g, p])
wp.synchronize()
```
Printed output:

```
grids [12918456320 12918521344]
env 0, tid 0, i 117, grid 12918456320
env 0, tid 1, i 479, grid 12918456320
env 1, tid 0, i 1, grid 12918521344
env 1, tid 1, i 0, grid 12918521344
```
The indices for the second grid get reordered correctly, but the values for the first grid seem to be random.
When using more than 2 points, a Warp CUDA error 716: misaligned address is thrown, suggesting that the randomness might come from some sort of out-of-bounds, unaligned read.
In the meantime I am using a single HashGrid for the whole domain, but I am wondering whether multiple HashGrid instances are supported and whether this is the correct way to use them.
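For reference, the single-grid workaround I am using in the meantime looks roughly like this (a minimal sketch; variable names follow the MRE above, and the grid dimensions are just a guess for the larger domain):

```python
# Sketch of the interim workaround: one HashGrid spanning the whole domain,
# built over all points flattened into a single array.
all_points = wp.array(pos.reshape(-1, 3), dtype=wp.vec3)
domain_grid = wp.HashGrid(128, 128, 128)
domain_grid.build(all_points, radius=r)
```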
Thanks
Thanks a lot for the repro @steinraf, I've not seen this before - we will investigate and get back to you.
@steinraf Sorry for the extremely late reply. I examined your reproducer and believe the issue comes down to the wp.HashGrid from the first iteration being garbage-collected before the hashgrid_tester kernel runs. This causes problems because a wp.HashGrid makes some additional memory allocations, which are freed when the object is destroyed:
https://github.com/NVIDIA/warp/blob/17f166e36e60c731f3978ecc3876ccf02c0d7ccb/warp/native/hashgrid.cpp#L206-L222
In the original code, only the grid.id was stored, which wasn't enough to keep the wp.HashGrid from being garbage collected. The workaround is to make sure we keep the hash grid objects from previous iterations alive:
```python
grids = []
for i in range(n_envs):
    grid = wp.HashGrid(20, 20, 20)
    pos[i, ...] += i * offset
    grid.build(wp.array(pos[i, ...], dtype=wp.vec3), radius=r)
    grids.append(grid)  # Changed: store the HashGrid object, not just its id

p = wp.array2d(pos, dtype=wp.vec3)
g = wp.array([grid.id for grid in grids], dtype=wp.uint64)  # Changed
```
Then we get:
```
env 0, tid 0, i 0, grid 17213423616
env 0, tid 1, i 1, grid 17213423616
env 1, tid 0, i 1, grid 17213488640
env 1, tid 1, i 0, grid 17213488640
```
Debugging strategy:
- Ran `compute-sanitizer --tool memcheck python test_0.py`: no errors
- Ran `compute-sanitizer --tool initcheck python test_0.py`: got `Uninitialized __global__ memory read of size 8 bytes at hashgrid_tester_5e2ff1ed_cuda_kernel_forward+0x1b0`
- Added `wp.config.mode = "debug"`, `wp.config.lineinfo = True`, and `wp.config.line_directives = False` to the top of the script
- Ran `compute-sanitizer` again and now see `Uninitialized __global__ memory read of size 8 bytes at wp::hash_grid_point_id(unsigned long long, int &)+0x15d0 in hashgrid.h:230`
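For anyone reproducing this, the debug configuration from the third step goes at the top of the script, before wp.init(). A minimal sketch, with my understanding of what each flag does in the comments:

```python
import warp as wp

# Debug settings for the sanitizer runs above; must be set before wp.init().
wp.config.mode = "debug"           # compile kernels in debug mode with extra checks
wp.config.lineinfo = True          # emit line info so compute-sanitizer can map addresses to source
wp.config.line_directives = False  # report native source lines (e.g. hashgrid.h) instead of Python lines

wp.init()
```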
I think I'll clarify our docs to warn about this.
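In the meantime, one way to make the lifetime coupling explicit is to bundle the grids together with the id array they feed into. A hypothetical helper, not part of the Warp API:

```python
import warp as wp

# Hypothetical convenience wrapper (not Warp API): keeps the HashGrid objects
# alive for as long as the uint64 id array that kernels consume.
class GridSet:
    def __init__(self, grids):
        self.grids = list(grids)  # holding the objects prevents garbage collection
        self.ids = wp.array([g.id for g in self.grids], dtype=wp.uint64)
```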