"gdrcopy_sanity" failing with NVIDIA 560 drivers on A100
OS: Ubuntu 20.04
Kernel: 5.15.0-1071
NVIDIA driver: 560.35.03
CUDA: 12.5
gdrcopy: 2.4.1
error:
gdrcopy_sanity
Assertion "(gdr_pin_buffer(g, d_A[0], buffer_size, 0, 0, &A_mh[0])) == (0)" failed at sanity.cpp:435
Assertion "(gdr_pin_buffer(g, d_A, A_size, 0, 0, &A_mh)) == (0)" failed at sanity.cpp:344
Total: 28, Passed: 24, Failed: 2, Waived: 2
List of failed tests:
basic_small_buffers_mapping
basic_unaligned_mapping
List of waived tests:
invalidation_access_after_free_cumemalloc
invalidation_access_after_free_vmmalloc
Error: Encountered an error or a test failure with status=1
Hi @rafsalas19,
This looks like an issue we have already fixed in the master branch. May I ask you to try it out? You will need to compile and install the gdrdrv driver to get this fix.
OK, thanks. Let me try.
OK, this worked, thanks! Can you let me know when a release/tag that includes this fix will be published?
@pakmarkthub Following up on this. We only pull releases to include on our systems. When can we expect a release that includes this fix, which appears to be 7 months old?
This is currently blocking us from adopting new NVIDIA Tesla drivers, and thus new CUDA runtimes as well.
@LiquidPT FYI, we recently tagged release 2.4.2.
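
For anyone who only deploys tagged releases, a small check like the one below could gate a driver upgrade on the installed gdrcopy version. It is a minimal sketch, not part of gdrcopy: the names `FIXED_RELEASE`, `parse_version`, and `has_fix` are hypothetical, and it assumes (as the reply above implies but does not state outright) that 2.4.2 is the first tag carrying the fix. How you obtain the installed version string depends on your packaging and is left to the caller.

```python
import re

# Assumption: 2.4.2 is the first tagged release that contains the fix
# discussed in this thread; adjust if that turns out not to be the case.
FIXED_RELEASE = (2, 4, 2)


def parse_version(text):
    """Extract (major, minor, patch) from a string such as 'v2.4.1' or '2.4.2'.

    A missing patch component is treated as 0.
    """
    match = re.search(r"(\d+)\.(\d+)(?:\.(\d+))?", text)
    if match is None:
        raise ValueError(f"no version number found in {text!r}")
    major, minor, patch = match.groups()
    return (int(major), int(minor), int(patch or 0))


def has_fix(version_text):
    """Return True if the given version is at or above FIXED_RELEASE."""
    return parse_version(version_text) >= FIXED_RELEASE
```

For example, `has_fix("2.4.1")` is false for the version reported at the top of this issue, while `has_fix("v2.4.2")` is true.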