
[BUG] Occasional failure in `test_single_layer_nn_cpu`

Open shi-eric opened this issue 3 months ago • 1 comment

Bug Description

The pipeline from https://github.com/NVIDIA/warp/blob/main/.gitlab/ci/clang-build-and-test.yml has occasionally been failing in `test_single_layer_nn_cpu` with the following error:

======================================================================
test_single_layer_nn_cpu (warp.tests.tile.test_tile_mlp.TestTileMLP.test_single_layer_nn_cpu)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/builds/omniverse/warp/warp/tests/unittest_utils.py", line 258, in test_func
    func(self, device, **kwargs)
  File "/builds/omniverse/warp/warp/tests/tile/test_tile_mlp.py", line 340, in test_single_layer_nn
    assert_np_equal(output.numpy(), output_np, tol=1.0e-2)
  File "/builds/omniverse/warp/warp/tests/unittest_utils.py", line 244, in assert_np_equal
    np.testing.assert_allclose(result.flatten(), expect.flatten(), atol=tol, equal_nan=True)
  File "/root/.cache/uv/archive-v0/SfYb6rYgA6cUqNMP-fI79/lib/python3.12/site-packages/numpy/testing/_private/utils.py", line 1718, in assert_allclose
    assert_array_compare(compare, actual, desired, err_msg=str(err_msg),
  File "/root/.cache/uv/archive-v0/SfYb6rYgA6cUqNMP-fI79/lib/python3.12/site-packages/numpy/testing/_private/utils.py", line 926, in assert_array_compare
    raise AssertionError(msg)
AssertionError: 
Not equal to tolerance rtol=1e-07, atol=0.01
Mismatched elements: 55 / 896 (6.14%)
Max absolute difference among violations: 0.22333258
Max relative difference among violations: 1.
 ACTUAL: array([0.000000e+00, 1.013524e-02, 0.000000e+00, 0.000000e+00,
       0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00,
       0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00,...
 DESIRED: array([0.000000e+00, 1.013524e-02, 0.000000e+00, 0.000000e+00,
       0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00,
       0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00,...
----------------------------------------------------------------------
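
For anyone triaging this, the check that fails is `np.testing.assert_allclose(..., atol=tol, equal_nan=True)` with the default `rtol=1e-07`, which flags an element when `|actual - desired| > atol + rtol * |desired|`. The sketch below re-implements that criterion in plain Python so the offending indices and values can be listed instead of only the truncated array preview. The helper name `find_mismatches` and the sample values are hypothetical and are not taken from the failing run.

```python
import math


def find_mismatches(actual, desired, atol=1.0e-2, rtol=1.0e-7):
    """Return (index, actual, desired, abs_diff) for each element that violates
    |actual - desired| <= atol + rtol * |desired|.

    This mirrors the criterion documented for np.testing.assert_allclose;
    pairs where both values are NaN count as equal (equal_nan=True).
    """
    mismatches = []
    for i, (a, d) in enumerate(zip(actual, desired)):
        if math.isnan(a) and math.isnan(d):
            continue
        diff = abs(a - d)
        # Written as "not <=" so that a NaN on only one side is also flagged.
        if not diff <= atol + rtol * abs(d):
            mismatches.append((i, a, d, diff))
    return mismatches


# Made-up sample data for illustration only.
actual = [0.0, 1.013524e-02, 0.0, 0.5]
desired = [0.0, 1.013524e-02, 0.2, 0.505]

bad = find_mismatches(actual, desired)
for index, a, d, diff in bad:
    print(f"index {index}: actual={a} desired={d} abs_diff={diff}")
```

In this made-up data only index 2 is reported, and its relative difference `abs_diff / |desired|` is exactly 1 because the actual value is zero where the reference is not. That is one way a "Max relative difference among violations: 1." line can arise; it is an illustration, not a diagnosis of this failure.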

System Information

No response

shi-eric • Oct 17 '25 06:10

Actually, this isn't limited to the https://github.com/NVIDIA/warp/blob/main/.gitlab/ci/clang-build-and-test.yml job.

We also see it on the ordinary (gcc-based) build-and-test jobs:

======================================================================
test_single_layer_nn_cpu (warp.tests.tile.test_tile_mlp.TestTileMLP.test_single_layer_nn_cpu)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/builds/omniverse/warp/warp/tests/unittest_utils.py", line 258, in test_func
    func(self, device, **kwargs)
  File "/builds/omniverse/warp/warp/tests/tile/test_tile_mlp.py", line 340, in test_single_layer_nn
    assert_np_equal(output.numpy(), output_np, tol=1.0e-2)
  File "/builds/omniverse/warp/warp/tests/unittest_utils.py", line 244, in assert_np_equal
    np.testing.assert_allclose(result.flatten(), expect.flatten(), atol=tol, equal_nan=True)
  File "/builds/omniverse/warp/.venv/lib/python3.12/site-packages/numpy/testing/_private/utils.py", line 1718, in assert_allclose
    assert_array_compare(compare, actual, desired, err_msg=str(err_msg),
  File "/builds/omniverse/warp/.venv/lib/python3.12/site-packages/numpy/testing/_private/utils.py", line 926, in assert_array_compare
    raise AssertionError(msg)
AssertionError: 
Not equal to tolerance rtol=1e-07, atol=0.01
Mismatched elements: 55 / 896 (6.14%)
Max absolute difference among violations: 0.22333258
Max relative difference among violations: 1.
 ACTUAL: array([0.000000e+00, 1.013524e-02, 0.000000e+00, 0.000000e+00,
       0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00,
       0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00,...
 DESIRED: array([0.000000e+00, 1.013524e-02, 0.000000e+00, 0.000000e+00,
       0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00,
       0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00,...
----------------------------------------------------------------------
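
Since the failure is intermittent and shows up on more than one job, it may help to measure how often it happens by calling the test repeatedly. This is a minimal sketch of such a loop: `count_failures` and `fake_flaky_test` are hypothetical names, and the stand-in test only simulates an occasional `AssertionError`; to use it for real, pass a callable that runs the actual test body.

```python
def count_failures(test_fn, runs=100):
    """Call test_fn `runs` times and return the run indices that raised AssertionError."""
    failed = []
    for i in range(runs):
        try:
            test_fn()
        except AssertionError:
            failed.append(i)
    return failed


# Hypothetical stand-in for the real test: fails on every 4th call.
calls = {"n": 0}


def fake_flaky_test():
    calls["n"] += 1
    assert calls["n"] % 4 != 0, "simulated intermittent mismatch"


failed_runs = count_failures(fake_flaky_test, runs=12)
print(f"{len(failed_runs)} of 12 runs failed: {failed_runs}")
```

Recording which runs fail, rather than only a count, makes it easier to see whether failures cluster (for example only on the first run) or are spread evenly.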

shi-eric • Oct 17 '25 16:10