cuda-api-wrappers
cuda-api-wrappers copied to clipboard
Bandwidth test example fails in copy kernels phase with "Invalid resource handles"
When running the p2pBandwidthLatencyTest, a modified CUDA sample program, we get:
... snip ...
Testing copy mechanism: Kernels...
Unidirectional P2P=Disabled Bandwidth Matrix (GB/s)
D\D 0 1 2
0 0.00 0.00 0.00
1 0.00 0.00 0.00
2 0.00 0.00 0.00
terminate called after throwing an instance of 'cuda::runtime_error'
what(): kernel launch failed for kernel at 0x034a47d0 on stream 0x033d5e70 in context 0x01c871b0 on device 1: invalid resource handle
unused/run-all-examples: line 7: 35177 Aborted (core dumped) ./$f
This happens on a machine with multiple Turing GPUs which are P2P-capable (I think).