
Run busGrind -p 1 -u 0 -e 0 -d 1, the Concurrent Host/Device Bandwidth Matrix result is confusing.

Open • Rainzhouzzz opened this issue 9 months ago • 1 comment

When I run busGrind -p 1 -u 0 -e 0 -d 1, I get:

...



Test Description: Bus bandwidth between the host and a single device


Host/Device Bandwidth Matrix (GB/s), memory=Pinned

Dir\D       0       1       2       3       4       5       6       7
D2H     56.83   57.12   57.14   57.15   55.37   55.47   55.49   53.53
H2D     56.17   56.21   56.21   56.20   56.17   56.17   56.12   56.14
BiDir  101.26  101.38  101.40  101.37   89.20   43.62    7.25    9.71





Test Description: Bus bandwidth between the host and multiple devices concurrently


Concurrent Host/Device Bandwidth Matrix (GB/s), memory=Pinned

Dir\D       0       1       2       3       4       5       6       7    Total
H2D     44.61   44.80   43.68   43.69   25.32   25.49   25.45   25.47   278.51
D2H     15.94   15.91   15.80   15.84   11.57   11.58   11.58   11.60   109.83
BiDir   22.55   22.76   22.62   22.64   10.04    4.40   17.99   18.03   141.02



As we can see, the BiDir results are confusing. Why do device 5 and device 7 have such low bandwidth? Is this expected?
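To make "low" concrete, here is a minimal Python sketch that flags devices whose BiDir bandwidth falls well below the row median. The values are copied from the two BiDir rows in the output quoted in this post. The helper name `low_devices` and the 0.5 × median cutoff are illustrative assumptions, not anything busGrind itself defines.

```python
from statistics import median

# BiDir rows copied from the busGrind output quoted in this post (GB/s),
# indexed by device number 0..7.
single_bidir = [101.26, 101.38, 101.40, 101.37, 89.20, 43.62, 7.25, 9.71]
concurrent_bidir = [22.55, 22.76, 22.62, 22.64, 10.04, 4.40, 17.99, 18.03]


def low_devices(row, fraction=0.5):
    """Return device indices whose bandwidth is below fraction * median(row).

    The default fraction of 0.5 is an arbitrary illustrative threshold
    (an assumption), not a rule taken from busGrind.
    """
    cutoff = fraction * median(row)
    return [dev for dev, gbps in enumerate(row) if gbps < cutoff]


print("single-device BiDir outliers:", low_devices(single_bidir))
print("concurrent BiDir outliers:", low_devices(concurrent_bidir))
```

With this particular threshold, the script flags devices 5, 6, and 7 in the single-device test and devices 4 and 5 in the concurrent test, so the set of slow devices is not the same across the two tests. A different cutoff would give a different list.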

busGrind is a tool from the CUDA demo suite. This might not be the right place to ask, but I couldn't find the demo suite's repository.

Looking forward to your response.

Rainzhouzzz, Mar 27 '25 11:03

For this kind of question, please ask in the NVIDIA Developer Forums: https://forums.developer.nvidia.com/. Thanks.

jnbntz, Apr 18 '25 18:04