How much memory can GDR pin? Are there differences between Tesla and Quadro?

Open Notherthing opened this issue 1 year ago • 10 comments

When I use a V100, I can gdr_pin nearly all of the device memory (about 32GB). But when I use an A4000, I can only pin about 220MB (the device has about 16GB of memory). Are there differences between Tesla and Quadro?

Notherthing avatar Jul 27 '24 18:07 Notherthing

Both devices use the same driver: NVIDIA-SMI 530.30.02, Driver Version: 530.30.02, CUDA Version: 12.1.

Notherthing avatar Jul 27 '24 18:07 Notherthing

I have tried disabling the CPU PA 46-bit limitation in the BIOS, but I can still only pin less than 220MB of GPU memory.

Notherthing avatar Jul 27 '24 19:07 Notherthing

Here is the error log. Does errno 12 mean out of memory? Is there any information about this?

GPU id:0; name: NVIDIA RTX A4000; Bus id: 0000:51:00
selecting device 0
testing size: 231735296
rounded size: 231735296
gpu alloc fn: cuMemAlloc
device ptr: 7f26b2000000
DBG: sse4_1=1 avx=1 sse=1 sse2=1
ERR: ioctl error (errno=12)
pin ret: 12
ERR: mh is mapped already
Assertion "(gdr_map(g, mh, &map_d_ptr, size)) == (0)" failed at copybw.cpp:81

Notherthing avatar Jul 27 '24 19:07 Notherthing
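For readers wondering what `errno=12` stands for: on Linux it is `ENOMEM`. The short sketch below decodes an errno value with the Python standard library. The helper name `describe_errno` is made up for illustration, and the message text depends on the platform's C library.

```python
import errno
import os


def describe_errno(code: int) -> str:
    """Return 'NAME: message' for an errno value.

    Kernel logs usually print errors as negative numbers (for example
    ret = -12), so the sign is dropped before the lookup.
    """
    code = abs(code)
    name = errno.errorcode.get(code, "UNKNOWN")
    return f"{name}: {os.strerror(code)}"


if __name__ == "__main__":
    # The value reported by the failing ioctl in the log above.
    print(describe_errno(12))
```

Note that `ENOMEM` from a pin request does not necessarily mean the GPU ran out of device memory; it only says that the driver could not satisfy the request.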

When I try to gdr_pin 221MB, it fails. Here is the output from dmesg. Does this give more information about the problem?

[64513.944169] gdrdrv:gdrdrv_open:minor=0 filep=0xff2796ccd17ff600
[64513.944176] gdrdrv:gdrdrv_ioctl:ioctl called (cmd 0xc008daff)
[64513.944183] gdrdrv:gdrdrv_ioctl:ioctl called (cmd 0xc028da01)
[64513.944295] gdrdrv:__gdrdrv_pin_buffer:invoking nvidia_p2p_get_pages(va=0x7f67b2000000 len=231735296 p2p_tok=0 va_tok=0 callback=ffffffffc06b2160)
[64513.944297] gdrdrv:__gdrdrv_pin_buffer:nvidia_p2p_get_pages(va=7f67b2000000 len=231735296 p2p_token=0 va_space=0 callback=ffffffffc06b2160) failed [ret = -12]
[64513.944298] gdrdrv:gdr_free_mr_unlocked:invoking unpin_buffer while callback has already been fired
[64513.959911] gdrdrv:gdrdrv_release:closing

Notherthing avatar Jul 28 '24 12:07 Notherthing

I noticed that the V100's BAR is about 32GB, but the A4000's is only 256MB. Can the A4000 get a 16GB BAR in compute mode? How can I switch the mode?

Notherthing avatar Jul 28 '24 14:07 Notherthing
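One way to check the BAR sizes mentioned above is to read the PCI device's `resource` file in Linux sysfs, where each line holds the start address, end address, and flags of one resource. The sketch below parses that format. The sample addresses are invented for illustration, the full device address `0000:51:00.0` is an assumption extended from the bus id in the log, and treating the second line as the GPU's BAR1 is an assumption about the usual NVIDIA layout rather than something confirmed in this thread.

```python
from pathlib import Path


def parse_bar_sizes(resource_text: str) -> list[int]:
    """Return the size in bytes of each resource line (0 for unused entries)."""
    sizes = []
    for line in resource_text.splitlines():
        fields = line.split()
        if len(fields) < 2:
            continue
        start, end = int(fields[0], 16), int(fields[1], 16)
        # The end address is inclusive; an all-zero line is an unused slot.
        sizes.append(end - start + 1 if end > start else 0)
    return sizes


# Hypothetical contents of a sysfs "resource" file (addresses are made up).
SAMPLE = """\
0x00000000fa000000 0x00000000faffffff 0x0000000000040200
0x00000000e0000000 0x00000000efffffff 0x000000000014220c
0x0000000000000000 0x0000000000000000 0x0000000000000000
0x00000000f0000000 0x00000000f1ffffff 0x000000000014220c
"""

if __name__ == "__main__":
    # Assumed device address; adjust it to match your own system.
    path = Path("/sys/bus/pci/devices/0000:51:00.0/resource")
    text = path.read_text() if path.exists() else SAMPLE
    for index, size in enumerate(parse_bar_sizes(text)):
        if size:
            print(f"resource {index}: {size // (1024 * 1024)} MiB")
```

A 256 MiB entry in the second line would be consistent with the failure at 221MB, since part of the BAR space may be reserved for other uses.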

Hi @Notherthing ,

As you have already figured out, the limitation is your GPU BAR size. This is a GPU hardware characteristic, so there is not much we can do here. You cannot map the entire GPU memory at once because of the small GPU BAR, but you can use a sliding window technique to map only the region you want to use. When you need to access a different region, free the current mapping first and then map the new region.

pakmarkthub avatar Jul 29 '24 01:07 pakmarkthub
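The sliding window idea described above can be sketched as follows. This is only a model of the control flow, not gdrcopy code: `plan_windows` and `for_each_window` are hypothetical helpers, and the `pin` and `unpin` callables are stand-ins that, in a real C or C++ program, would presumably wrap `gdr_pin_buffer` plus `gdr_map` and `gdr_unmap` plus `gdr_unpin_buffer`. The 64 KiB granularity is assumed to match gdrcopy's GPU page size.

```python
from typing import Callable, Iterator

GPU_PAGE_SIZE = 64 * 1024  # assumed pinning granularity (64 KiB)


def plan_windows(base: int, length: int, window_bytes: int) -> Iterator[tuple[int, int]]:
    """Yield (address, size) windows covering [base, base + length).

    Every window starts on a GPU page boundary and is a multiple of the
    GPU page size, so each one can be pinned on its own.
    """
    window_bytes -= window_bytes % GPU_PAGE_SIZE
    if window_bytes <= 0:
        raise ValueError("window must hold at least one GPU page")
    addr = base - base % GPU_PAGE_SIZE                      # round start down
    end = -(-(base + length) // GPU_PAGE_SIZE) * GPU_PAGE_SIZE  # round end up
    while addr < end:
        size = min(window_bytes, end - addr)
        yield addr, size
        addr += size


def for_each_window(base: int, length: int, window_bytes: int,
                    pin: Callable[[int, int], object],
                    unpin: Callable[[object], None],
                    work: Callable[[object, int, int], None]) -> None:
    """Pin one window, use it, release it, then move on to the next one."""
    for addr, size in plan_windows(base, length, window_bytes):
        handle = pin(addr, size)
        try:
            work(handle, addr, size)
        finally:
            unpin(handle)  # free the mapping before the next window is pinned


if __name__ == "__main__":
    MiB = 1024 * 1024
    # 16 GiB of device memory walked through a 128 MiB window.
    windows = list(plan_windows(0x7F26B2000000, 16 * 1024 * MiB, 128 * MiB))
    print(len(windows), "windows of up to", windows[0][1] // MiB, "MiB each")
```

The window size is a tuning choice: it has to stay below the amount of BAR space that can actually be pinned (about 220MB in this thread), and a larger window means fewer pin and unpin round trips.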

Thank you, my friend. It's a pity to learn that the cheaper Quadro devices have a small BAR size. I noticed that this displaymode is used to switch the GPU mode to get a larger BAR size, but it doesn't mention the A4000 (only the A5000 and higher-spec devices). Will it work for the A4000? And thank you sincerely for your valuable advice. If we cannot enlarge the BAR size, I think a special design is necessary when using GDR.

Notherthing avatar Jul 29 '24 11:07 Notherthing

I am not sure what that script does. Because A4000 is not in the support list, I would not advise you to try it.

Generally, small-BAR GPUs remain small-BAR. You may be able to disable graphics mode using nvidia-smi, depending on your card. However, that would not change your total GPU BAR size; it might just free the BAR space reserved for graphics. This is something you can experiment with to squeeze out a few more MB.

pakmarkthub avatar Jul 31 '24 08:07 pakmarkthub

@Notherthing depending on your motherboard, you might also be able to get a larger BAR1 by taking advantage of the PCIe "Resizable BAR" feature. In practice, the SBIOS would read a range of supported GPU BAR sizes (through a config register placed in a PCIe extension) and pick a reasonably large size.

drossetti avatar Jul 31 '24 18:07 drossetti

Thanks. I am going to try.

Notherthing avatar Aug 01 '24 13:08 Notherthing