question about environment

Open garylidd opened this issue 1 year ago • 3 comments

Hi, thanks for sharing your code. I can compile the ash module, but it fails the tests you provided. The error is always something like: stdgpu::vector::size : Size out of bounds: -14 not in [0, 111]. I suspect it is an environment issue. My environment is CUDA 11.7, PyTorch 1.12.1, and Python 3.8. Could you share some information about your development environment?
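For anyone comparing setups, here is a minimal sketch that prints the versions mentioned above. The helper name `describe_environment` is made up for illustration; it only uses standard `torch` attributes and falls back gracefully if PyTorch is not importable.

```python
import importlib
import platform
import sys


def describe_environment() -> dict:
    """Collect the Python / PyTorch / CUDA details worth pasting into a bug report."""
    info = {"python": platform.python_version(), "platform": sys.platform}
    try:
        torch = importlib.import_module("torch")
    except ImportError:
        # PyTorch is missing in this interpreter; report that instead of crashing.
        info["torch"] = "not installed"
        return info
    info["torch"] = torch.__version__
    # CUDA version PyTorch was built against (None for CPU-only builds).
    info["cuda (build)"] = torch.version.cuda
    info["cuda available"] = torch.cuda.is_available()
    if torch.cuda.is_available():
        info["gpu"] = torch.cuda.get_device_name(0)
    return info


if __name__ == "__main__":
    for key, value in describe_environment().items():
        print(f"{key}: {value}")
```

Note that `torch.version.cuda` reports the toolkit PyTorch was built with, which may differ from the system `nvcc` used to compile the extension.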

garylidd avatar Aug 23 '23 09:08 garylidd

Could you please specify which test failed? This is a known issue in stdgpu when there is a large number of negative values, and I plan to investigate it. It should at least have passed the tests, though...
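One way to narrow this down is to run a single test with output capture disabled, so the stdgpu messages print as they happen. The sketch below is an assumption about a convenient workflow, not something from the repository; `pytest_args` and `run_tests` are hypothetical helper names, while `pytest.main` and the `-x`/`-s` flags are standard pytest.

```python
from typing import List


def pytest_args(node_ids: List[str]) -> List[str]:
    """Build a pytest argument list for the given test node IDs."""
    # -x stops at the first failure; -s disables output capture so that
    # stdout from the extension is shown live instead of in the report.
    return ["-x", "-s", *node_ids]


def run_tests(node_ids: List[str]) -> int:
    """Run the selected tests in-process and return pytest's exit code."""
    import pytest  # imported lazily so the helper above works without pytest

    return int(pytest.main(pytest_args(node_ids)))


# Example (run from the repository root):
# run_tests(["unittests/test_hashset.py::TestHashSet::test_resize"])
```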

theNded avatar Aug 28 '23 02:08 theNded

Thanks for your reply. I have posted the complete test log below for your reference.

platform linux -- Python 3.8.17, pytest-7.4.0, pluggy-1.2.0
rootdir: /home/administrator/Documents/torch-ash
plugins: torchtyping-0.1.4, dash-2.11.1, typeguard-4.0.0, anyio-3.7.1
collected 27 items

unittests/test_embedding.py .... [ 14%]
unittests/test_engine.py ....... [ 40%]
unittests/test_hashmap.py ..... [ 59%]
unittests/test_hashset.py ..F. [ 74%]
unittests/test_sparsedense_grid.py .....F. [100%]

=========================== FAILURES ===========================
___________________ TestHashSet.test_resize ____________________

self = <unittests.test_hashset.TestHashSet object at 0x7f5a09c9c0a0>

def test_resize(self):
    self._resize_block(1, 100, 100, 1000)
    self._resize_block(1, 1000, 1000, 10000)

unittests/test_hashset.py:95:


self = <unittests.test_hashset.TestHashSet object at 0x7f5a09c9c0a0>, dim = 1, num = 1000, old_capacity = 1000, new_capacity = 10000

def _resize_block(self, dim, num, old_capacity, new_capacity):
    hashset = HashSet(dim, old_capacity, self.device)

    keys = self._generate_keys(dim, num)
    hashset.insert(keys)

    hashset.resize(new_capacity)

    masks = hashset.find(keys)
    assert masks.sum() == num

E AssertionError: assert tensor(996, device='cuda:0') == 1000
E + where tensor(996, device='cuda:0') = <built-in method sum of Tensor object at 0x7f5a09c84db0>()
E + where <built-in method sum of Tensor object at 0x7f5a09c84db0> = tensor([ True, True, True, True, True, True, True, True, True, True,\n True, True, True, True, Tru...e, True, True,\n True, True, True, True, True, True, True, True, True, True],\n device='cuda:0').sum

unittests/test_hashset.py:52: AssertionError
--------------------- Captured stdout call ---------------------
[c++] insert_keys 100
[c++] insert_keys 100
[c++] insert_keys 1000
stdgpu::vector::pop_back : Index out of bounds: -1 not in [0, 414]
stdgpu::vector::size : Size out of bounds: -1 not in [0, 415]. Clamping to 0
stdgpu::vector::size : Size out of bounds: -1 not in [0, 415]. Clamping to 0
stdgpu::vector::pop_back : Object empty
unordered_base::try_insert : Associated bucket and excess list full
unordered_base::try_insert : Associated bucket and excess list full
stdgpu::vector::size : Size out of bounds: -1 not in [0, 415]. Clamping to 0
stdgpu::vector::size : Size out of bounds: -1 not in [0, 415]. Clamping to 0
stdgpu::vector::size : Size out of bounds: -1 not in [0, 415]. Clamping to 0
stdgpu::vector::pop_back : Object empty
unordered_base::try_insert : Associated bucket and excess list full
stdgpu::vector::size : Size out of bounds: -1 not in [0, 415]. Clamping to 0
[c++] insert_keys 996
_______________ TestSparseDenseGrid.test_forward _______________

self = <unittests.test_sparsedense_grid.TestSparseDenseGrid object at 0x7f5a09c9ce80>

def test_forward(self):
    self._forward_block(in_dim=3, embedding_dim=3, grid_dim=1)

unittests/test_sparsedense_grid.py:379:


self = <unittests.test_sparsedense_grid.TestSparseDenseGrid object at 0x7f5a09c9ce80>, in_dim = 3, embedding_dim = 3, grid_dim = 1, bound = 3

def _forward_block(self, in_dim, embedding_dim, grid_dim, bound=3):
    grid = SparseDenseGrid(
        in_dim=in_dim,
        num_embeddings=self.capacity,
        embedding_dim=embedding_dim,
        grid_dim=grid_dim,
        device=self.device,
    )

    grid_coord_range = torch.arange(
        -bound, bound + 1, 1, dtype=torch.int, device=self.device
    )

    # Create a dense grid to test correctness
    grid_coords = grid_dim * torch.stack(
        torch.meshgrid(
            grid_coord_range, grid_coord_range, grid_coord_range, indexing="ij"
        ),
        dim=-1,
    ).view(-1, 3)

    grid.spatial_init_(grid_coords, dilation=0)
    grid_coords, cell_coords, grid_indices, cell_indices = grid.items()
    coords = grid_coords * grid_dim + cell_coords
    coords = coords.view(-1, in_dim).float()

    with torch.no_grad():
        grid.embeddings[grid_indices, cell_indices, :3] = coords.view(
            grid_indices.shape[0], cell_indices.shape[1], 3
        )

    # Map query to [min, max - 1) to check purely in-bound queries
    num_queries = 1000
    query_cell_coords = torch.rand(num_queries, in_dim, device=self.device)
    # min: -grid_dim * bound
    # max: grid_dim * bound - 1
    query_cell_coords = (
        2 * grid_dim * bound - 1
    ) * query_cell_coords - grid_dim * bound

    embeddings, masks = grid(query_cell_coords, interpolation="linear")
    assert torch.allclose(embeddings[..., :3], query_cell_coords)

E AssertionError: assert False
E + where False = <built-in method allclose of type object at 0x7f5a9a7311c0>(tensor([[ 1.0372, -0.0440, 0.1211],\n [ 0.0000, 0.0000, 0.0000],\n [ 0.0000, 0.0000, 0.0000],\n ... [ 0.0000, 0.0000, 0.0000],\n [ 0.0000, 0.0000, 0.0000]], device='cuda:0',\n grad_fn=<AliasBackward0>), tensor([[ 1.0372, -0.0440, 0.1211],\n [ 1.9756, 0.9248, 1.6209],\n [ 1.5990, -2.0546, -2.2807],\n ...-0.2605, -1.4161, 0.9210],\n [-1.8164, -2.4182, 0.5889],\n [-0.1101, -0.8764, -2.6821]], device='cuda:0'))
E + where <built-in method allclose of type object at 0x7f5a9a7311c0> = torch.allclose

unittests/test_sparsedense_grid.py:180: AssertionError ----------------------------------------------------------------------------------------------------------------------------------------- Captured stdout call ------------------------------------------------------------------------------------------------------------------------------------------ [c++] insert_keys 343 stdgpu::vector::pop_back : Index out of bounds: -1 not in [0, 110] stdgpu::vector::pop_back : Index out of bounds: -2 not in [0, 110] stdgpu::vector::size : Size out of bounds: -2 not in [0, 111]. Clamping to 0 stdgpu::vector::size : Size out of bounds: -2 not in [0, 111]. Clamping to 0 stdgpu::vector::size : Size out of bounds: -2 not in [0, 111]. Clamping to 0 stdgpu::vector::size : Size out of bounds: -2 not in [0, 111]. Clamping to 0 stdgpu::vector::size : Size out of bounds: -2 not in [0, 111]. Clamping to 0 stdgpu::vector::size : Size out of bounds: -2 not in [0, 111]. Clamping to 0 stdgpu::vector::size : Size out of bounds: -2 not in [0, 111]. Clamping to 0 stdgpu::vector::size : Size out of bounds: -2 not in [0, 111]. Clamping to 0 stdgpu::vector::size : Size out of bounds: -2 not in [0, 111]. Clamping to 0 stdgpu::vector::size : Size out of bounds: -2 not in [0, 111]. Clamping to 0 stdgpu::vector::size : Size out of bounds: -2 not in [0, 111]. Clamping to 0 stdgpu::vector::size : Size out of bounds: -2 not in [0, 111]. Clamping to 0 stdgpu::vector::size : Size out of bounds: -2 not in [0, 111]. Clamping to 0 stdgpu::vector::size : Size out of bounds: -2 not in [0, 111]. Clamping to 0 stdgpu::vector::size : Size out of bounds: -2 not in [0, 111]. Clamping to 0 stdgpu::vector::size : Size out of bounds: -2 not in [0, 111]. Clamping to 0 stdgpu::vector::size : Size out of bounds: -2 not in [0, 111]. Clamping to 0 stdgpu::vector::size : Size out of bounds: -2 not in [0, 111]. Clamping to 0 stdgpu::vector::size : Size out of bounds: -2 not in [0, 111]. 
Clamping to 0 stdgpu::vector::size : Size out of bounds: -2 not in [0, 111]. Clamping to 0 stdgpu::vector::size : Size out of bounds: -2 not in [0, 111]. Clamping to 0 stdgpu::vector::size : Size out of bounds: -2 not in [0, 111]. Clamping to 0 stdgpu::vector::size : Size out of bounds: -2 not in [0, 111]. Clamping to 0 stdgpu::vector::size : Size out of bounds: -2 not in [0, 111]. Clamping to 0 stdgpu::vector::size : Size out of bounds: -2 not in [0, 111]. Clamping to 0 stdgpu::vector::size : Size out of bounds: -2 not in [0, 111]. Clamping to 0 stdgpu::vector::size : Size out of bounds: -2 not in [0, 111]. Clamping to 0 stdgpu::vector::size : Size out of bounds: -2 not in [0, 111]. Clamping to 0 stdgpu::vector::size : Size out of bounds: -2 not in [0, 111]. Clamping to 0 stdgpu::vector::size : Size out of bounds: -2 not in [0, 111]. Clamping to 0 stdgpu::vector::size : Size out of bounds: -2 not in [0, 111]. Clamping to 0 stdgpu::vector::size : Size out of bounds: -2 not in [0, 111]. Clamping to 0 stdgpu::vector::size : Size out of bounds: -2 not in [0, 111]. Clamping to 0 stdgpu::vector::size : Size out of bounds: -2 not in [0, 111]. Clamping to 0 stdgpu::vector::size : Size out of bounds: -2 not in [0, 111]. Clamping to 0 stdgpu::vector::size : Size out of bounds: -2 not in [0, 111]. Clamping to 0 stdgpu::vector::size : Size out of bounds: -2 not in [0, 111]. Clamping to 0 stdgpu::vector::size : Size out of bounds: -2 not in [0, 111]. Clamping to 0 stdgpu::vector::size : Size out of bounds: -2 not in [0, 111]. Clamping to 0 stdgpu::vector::size : Size out of bounds: -2 not in [0, 111]. Clamping to 0 stdgpu::vector::size : Size out of bounds: -2 not in [0, 111]. Clamping to 0 stdgpu::vector::size : Size out of bounds: -2 not in [0, 111]. Clamping to 0 stdgpu::vector::size : Size out of bounds: -2 not in [0, 111]. Clamping to 0 stdgpu::vector::size : Size out of bounds: -2 not in [0, 111]. 
Clamping to 0 stdgpu::vector::size : Size out of bounds: -2 not in [0, 111]. Clamping to 0 stdgpu::vector::size : Size out of bounds: -2 not in [0, 111]. Clamping to 0 stdgpu::vector::size : Size out of bounds: -2 not in [0, 111]. Clamping to 0 stdgpu::vector::size : Size out of bounds: -2 not in [0, 111]. Clamping to 0 stdgpu::vector::size : Size out of bounds: -2 not in [0, 111]. Clamping to 0 stdgpu::vector::size : Size out of bounds: -2 not in [0, 111]. Clamping to 0 stdgpu::vector::size : Size out of bounds: -2 not in [0, 111]. Clamping to 0 stdgpu::vector::size : Size out of bounds: -2 not in [0, 111]. Clamping to 0 stdgpu::vector::size : Size out of bounds: -2 not in [0, 111]. Clamping to 0 stdgpu::vector::size : Size out of bounds: -2 not in [0, 111]. Clamping to 0 stdgpu::vector::size : Size out of bounds: -2 not in [0, 111]. Clamping to 0 stdgpu::vector::size : Size out of bounds: -2 not in [0, 111]. Clamping to 0 stdgpu::vector::size : Size out of bounds: -2 not in [0, 111]. Clamping to 0 stdgpu::vector::size : Size out of bounds: -2 not in [0, 111]. Clamping to 0 stdgpu::vector::size : Size out of bounds: -2 not in [0, 111]. Clamping to 0 stdgpu::vector::size : Size out of bounds: -2 not in [0, 111]. Clamping to 0 stdgpu::vector::size : Size out of bounds: -2 not in [0, 111]. Clamping to 0 stdgpu::vector::size : Size out of bounds: -2 not in [0, 111]. Clamping to 0 stdgpu::vector::size : Size out of bounds: -2 not in [0, 111]. Clamping to 0 stdgpu::vector::size : Size out of bounds: -2 not in [0, 111]. Clamping to 0 stdgpu::vector::size : Size out of bounds: -2 not in [0, 111]. Clamping to 0 stdgpu::vector::size : Size out of bounds: -2 not in [0, 111]. Clamping to 0 stdgpu::vector::size : Size out of bounds: -2 not in [0, 111]. Clamping to 0 stdgpu::vector::size : Size out of bounds: -2 not in [0, 111]. 
Clamping to 0 stdgpu::vector::pop_back : Object empty stdgpu::vector::pop_back : Object empty stdgpu::vector::pop_back : Object empty stdgpu::vector::pop_back : Object empty stdgpu::vector::pop_back : Object empty stdgpu::vector::pop_back : Object empty stdgpu::vector::pop_back : Object empty stdgpu::vector::pop_back : Object empty stdgpu::vector::pop_back : Object empty stdgpu::vector::pop_back : Object empty stdgpu::vector::pop_back : Object empty stdgpu::vector::pop_back : Object empty stdgpu::vector::pop_back : Object empty stdgpu::vector::pop_back : Object empty stdgpu::vector::pop_back : Object empty stdgpu::vector::pop_back : Object empty stdgpu::vector::pop_back : Object empty stdgpu::vector::pop_back : Object empty stdgpu::vector::pop_back : Object empty stdgpu::vector::pop_back : Object empty stdgpu::vector::pop_back : Object empty stdgpu::vector::pop_back : Object empty unordered_base::try_insert : Associated bucket and excess list full unordered_base::try_insert : Associated bucket and excess list full unordered_base::try_insert : Associated bucket and excess list full unordered_base::try_insert : Associated bucket and excess list full unordered_base::try_insert : Associated bucket and excess list full unordered_base::try_insert : Associated bucket and excess list full unordered_base::try_insert : Associated bucket and excess list full unordered_base::try_insert : Associated bucket and excess list full unordered_base::try_insert : Associated bucket and excess list full unordered_base::try_insert : Associated bucket and excess list full unordered_base::try_insert : Associated bucket and excess list full unordered_base::try_insert : Associated bucket and excess list full unordered_base::try_insert : Associated bucket and excess list full unordered_base::try_insert : Associated bucket and excess list full unordered_base::try_insert : Associated bucket and excess list full unordered_base::try_insert : Associated bucket and excess list full 
unordered_base::try_insert : Associated bucket and excess list full unordered_base::try_insert : Associated bucket and excess list full unordered_base::try_insert : Associated bucket and excess list full unordered_base::try_insert : Associated bucket and excess list full unordered_base::try_insert : Associated bucket and excess list full unordered_base::try_insert : Associated bucket and excess list full unordered_base::try_insert : Associated bucket and excess list full unordered_base::try_insert : Associated bucket and excess list full stdgpu::vector::size : Size out of bounds: -2 not in [0, 111]. Clamping to 0 stdgpu::vector::size : Size out of bounds: -2 not in [0, 111]. Clamping to 0 stdgpu::vector::size : Size out of bounds: -2 not in [0, 111]. Clamping to 0 stdgpu::vector::size : Size out of bounds: -2 not in [0, 111]. Clamping to 0 stdgpu::vector::size : Size out of bounds: -2 not in [0, 111]. Clamping to 0 stdgpu::vector::size : Size out of bounds: -2 not in [0, 111]. Clamping to 0 stdgpu::vector::size : Size out of bounds: -2 not in [0, 111]. Clamping to 0 stdgpu::vector::size : Size out of bounds: -2 not in [0, 111]. Clamping to 0 stdgpu::vector::size : Size out of bounds: -2 not in [0, 111]. Clamping to 0 stdgpu::vector::size : Size out of bounds: -2 not in [0, 111]. Clamping to 0 stdgpu::vector::size : Size out of bounds: -2 not in [0, 111]. Clamping to 0 stdgpu::vector::size : Size out of bounds: -2 not in [0, 111]. Clamping to 0 stdgpu::vector::size : Size out of bounds: -2 not in [0, 111]. Clamping to 0 stdgpu::vector::size : Size out of bounds: -2 not in [0, 111]. Clamping to 0 stdgpu::vector::size : Size out of bounds: -2 not in [0, 111]. Clamping to 0 stdgpu::vector::size : Size out of bounds: -2 not in [0, 111]. Clamping to 0 stdgpu::vector::size : Size out of bounds: -2 not in [0, 111]. Clamping to 0 stdgpu::vector::size : Size out of bounds: -2 not in [0, 111]. Clamping to 0 stdgpu::vector::size : Size out of bounds: -2 not in [0, 111]. 
Clamping to 0 stdgpu::vector::size : Size out of bounds: -2 not in [0, 111]. Clamping to 0 stdgpu::vector::size : Size out of bounds: -2 not in [0, 111]. Clamping to 0 stdgpu::vector::size : Size out of bounds: -2 not in [0, 111]. Clamping to 0 stdgpu::vector::size : Size out of bounds: -2 not in [0, 111]. Clamping to 0 stdgpu::vector::size : Size out of bounds: -2 not in [0, 111]. Clamping to 0 stdgpu::vector::size : Size out of bounds: -2 not in [0, 111]. Clamping to 0 stdgpu::vector::size : Size out of bounds: -2 not in [0, 111]. Clamping to 0 stdgpu::vector::size : Size out of bounds: -2 not in [0, 111]. Clamping to 0 stdgpu::vector::size : Size out of bounds: -2 not in [0, 111]. Clamping to 0 stdgpu::vector::size : Size out of bounds: -2 not in [0, 111]. Clamping to 0 stdgpu::vector::size : Size out of bounds: -2 not in [0, 111]. Clamping to 0 stdgpu::vector::size : Size out of bounds: -2 not in [0, 111]. Clamping to 0 stdgpu::vector::size : Size out of bounds: -2 not in [0, 111]. Clamping to 0 stdgpu::vector::size : Size out of bounds: -2 not in [0, 111]. Clamping to 0 stdgpu::vector::size : Size out of bounds: -2 not in [0, 111]. Clamping to 0 stdgpu::vector::size : Size out of bounds: -2 not in [0, 111]. Clamping to 0 stdgpu::vector::size : Size out of bounds: -2 not in [0, 111]. Clamping to 0 stdgpu::vector::size : Size out of bounds: -2 not in [0, 111]. Clamping to 0 stdgpu::vector::size : Size out of bounds: -2 not in [0, 111]. Clamping to 0 stdgpu::vector::size : Size out of bounds: -2 not in [0, 111]. Clamping to 0 stdgpu::vector::size : Size out of bounds: -2 not in [0, 111]. Clamping to 0 stdgpu::vector::size : Size out of bounds: -2 not in [0, 111]. Clamping to 0 stdgpu::vector::size : Size out of bounds: -2 not in [0, 111]. 
Clamping to 0
stdgpu::vector::pop_back : Object empty
stdgpu::vector::pop_back : Object empty
stdgpu::vector::pop_back : Object empty
stdgpu::vector::pop_back : Object empty
stdgpu::vector::pop_back : Object empty
stdgpu::vector::pop_back : Object empty
unordered_base::try_insert : Associated bucket and excess list full
unordered_base::try_insert : Associated bucket and excess list full
unordered_base::try_insert : Associated bucket and excess list full
unordered_base::try_insert : Associated bucket and excess list full
unordered_base::try_insert : Associated bucket and excess list full
unordered_base::try_insert : Associated bucket and excess list full
stdgpu::vector::size : Size out of bounds: -2 not in [0, 111]. Clamping to 0
stdgpu::vector::size : Size out of bounds: -2 not in [0, 111]. Clamping to 0
stdgpu::vector::size : Size out of bounds: -2 not in [0, 111]. Clamping to 0
stdgpu::vector::size : Size out of bounds: -2 not in [0, 111]. Clamping to 0
stdgpu::vector::size : Size out of bounds: -2 not in [0, 111]. Clamping to 0
stdgpu::vector::size : Size out of bounds: -2 not in [0, 111]. Clamping to 0
[c++] insert_keys 255
======================= warnings summary =======================
../../anaconda3/envs/sdfstudio/lib/python3.8/site-packages/torch/nn/modules/module.py:1365
unittests/test_embedding.py::TestEmbedding::test_backward
unittests/test_engine.py::TestEngine::test_state_dict
unittests/test_hashmap.py::TestMap::test_state_dict
/home/administrator/anaconda3/envs/sdfstudio/lib/python3.8/site-packages/torch/nn/modules/module.py:1365: UserWarning: Positional args are being deprecated, use kwargs instead. Refer to https://pytorch.org/docs/master/generated/torch.nn.Module.html#torch.nn.Module.state_dict for details.
warnings.warn(

unittests/test_sparsedense_grid.py::TestSparseDenseGrid::test_items
unittests/test_sparsedense_grid.py::TestSparseDenseGrid::test_query
unittests/test_sparsedense_grid.py::TestSparseDenseGrid::test_forward
unittests/test_sparsedense_grid.py::TestSparseDenseGrid::test_backward
/home/administrator/Documents/torch-ash/ash/core.py:145: UserWarning: keys are not int32, conversion might reduce precision.
warnings.warn("keys are not int32, conversion might reduce precision.")

unittests/test_sparsedense_grid.py::TestSparseDenseGrid::test_backward
/home/administrator/anaconda3/envs/sdfstudio/lib/python3.8/site-packages/torch/autograd/gradcheck.py:652: UserWarning: Input #0 requires gradient and is not a double precision floating point or complex. This check will likely fail if all the inputs are not of double precision floating point or complex.
warnings.warn(

-- Docs: https://docs.pytest.org/en/stable/how-to/capture-warnings.html
=================== short test summary info ====================
FAILED unittests/test_hashset.py::TestHashSet::test_resize - AssertionError: assert tensor(996, device='cuda:0') == 1000
FAILED unittests/test_sparsedense_grid.py::TestSparseDenseGrid::test_forward - AssertionError: assert False
========== 2 failed, 25 passed, 9 warnings in 43.19s ===========
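Since the captured stdout repeats the same few messages many times, a short tally may be easier to read than the raw log. This is a minimal sketch: the message prefixes are copied from the log above, and `tally_stdgpu_messages` is a hypothetical helper name, not part of torch-ash or stdgpu.

```python
import re
from collections import Counter

# Message prefixes exactly as they appear in the captured stdout of this log.
MESSAGE_PATTERN = re.compile(
    r"(stdgpu::vector::pop_back : Index out of bounds"
    r"|stdgpu::vector::pop_back : Object empty"
    r"|stdgpu::vector::size : Size out of bounds"
    r"|unordered_base::try_insert : Associated bucket and excess list full)"
)


def tally_stdgpu_messages(log_text: str) -> Counter:
    """Count how often each known stdgpu message occurs in a captured log."""
    return Counter(MESSAGE_PATTERN.findall(log_text))


if __name__ == "__main__":
    # Short excerpt in the same flattened form as the pasted log.
    sample = (
        "[c++] insert_keys 1000 "
        "stdgpu::vector::pop_back : Index out of bounds: -1 not in [0, 414] "
        "stdgpu::vector::size : Size out of bounds: -1 not in [0, 415]. Clamping to 0 "
        "stdgpu::vector::pop_back : Object empty "
        "unordered_base::try_insert : Associated bucket and excess list full"
    )
    for message, count in tally_stdgpu_messages(sample).most_common():
        print(f"{count:5d}  {message}")
```

The pattern matches on prefixes only, so it works whether the log has one message per line or has been flattened into long lines as it is here.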

garylidd avatar Aug 28 '23 02:08 garylidd
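As a side check on test_forward, the query mapping in _forward_block can be worked out by hand. The sketch below only redoes the arithmetic from the test's comments in plain Python (no torch); `query_range` is a made-up name for illustration.

```python
from typing import Tuple


def query_range(grid_dim: int, bound: int) -> Tuple[float, float]:
    """Return the half-open interval [lo, hi) that u in [0, 1) is mapped to.

    Mirrors the expression in the test:
        (2 * grid_dim * bound - 1) * u - grid_dim * bound
    """
    scale = 2 * grid_dim * bound - 1
    offset = grid_dim * bound
    lo = scale * 0.0 - offset  # value at u = 0
    hi = scale * 1.0 - offset  # limit as u approaches 1 (never reached)
    return lo, hi


if __name__ == "__main__":
    # Parameters from the failing call: grid_dim=1, bound=3.
    print(query_range(1, 3))  # (-3.0, 2.0)
```

With grid_dim=1 and bound=3 the queries fall in [-3, 2), inside the grid coordinates -3..3 that the test initializes. Under that reading, the all-zero rows in the failing output look more like cells missing from the hash map than out-of-range queries, though that is an inference from the log and not a confirmed diagnosis.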

Thanks, this is very helpful. I will try to identify the problem.

theNded avatar Aug 28 '23 03:08 theNded