cudaMalloc can no longer guarantee to return 64kB aligned address

Open e-ago opened this issue 5 years ago • 3 comments

GDRDRV needs 64kB aligned addresses.

gdrdrv_pin_buffer() {
...
    page_virt_start  = params.addr & GPU_PAGE_MASK;
    page_virt_end    = params.addr + params.size - 1;
    rounded_size     = page_virt_end - page_virt_start + 1;
    mr->offset       = params.addr & GPU_PAGE_OFFSET;
...
}

and

gdrdrv_mmap() {
...
    if (mr->offset) {
        gdr_dbg("offset != 0 is not supported\n");
        ret = -EINVAL;
        goto out;
    }
...
}

This alignment is no longer guaranteed by cudaMalloc in recent CUDA drivers (since 410). A temporary workaround (WAR) at the application level could be to allocate, with cudaMalloc, a memory area of size + GPU_PAGE_SIZE and then search for the first 64kB-aligned address within it. Something like:

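/* Over-allocate by one GPU page so that an aligned start always fits. */
alloc_size = buffer_size + GPU_PAGE_SIZE;
cuMemAlloc(&dev_addr, alloc_size);
/* Keep dev_addr for cuMemFree(); use aligned_addr for pinning. */
aligned_addr = dev_addr;
if (aligned_addr % GPU_PAGE_SIZE) {
    aligned_addr += (GPU_PAGE_SIZE - (aligned_addr % GPU_PAGE_SIZE));
}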
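
The rounding used in this workaround can be checked on the host without a GPU. The following is a minimal sketch, assuming 64kB GPU pages and page macros defined the way gdrdrv appears to define them; the helper names `padded_alloc_size` and `align_up_to_gpu_page` are hypothetical and are not part of gdrcopy or CUDA.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed to mirror the gdrdrv definitions: 64kB GPU pages. */
#define GPU_PAGE_SHIFT  16
#define GPU_PAGE_SIZE   ((uint64_t)1 << GPU_PAGE_SHIFT)
#define GPU_PAGE_OFFSET (GPU_PAGE_SIZE - 1)
#define GPU_PAGE_MASK   (~GPU_PAGE_OFFSET)

/* Hypothetical helper: bytes to request so that an aligned region of
 * buffer_size bytes always fits inside the allocation. */
static inline size_t padded_alloc_size(size_t buffer_size)
{
    assert(buffer_size > 0);
    return buffer_size + (size_t)GPU_PAGE_SIZE;
}

/* Hypothetical helper: first 64kB-aligned address at or above addr. */
static inline uint64_t align_up_to_gpu_page(uint64_t addr)
{
    return (addr + GPU_PAGE_OFFSET) & GPU_PAGE_MASK;
}
```

In an application, the address returned by cuMemAlloc would be passed to `align_up_to_gpu_page` and the result used for pinning, while the original address is kept for cuMemFree.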

e-ago, May 09 '19 13:05

I have also encountered this bug.

@drossetti Can we remove the offset check in gdrdrv_mmap()? To help users, we could also add a flag to gdr_map() so that it automatically applies the offset (if any) before returning ptr_va.

pakmarkthub, Jun 17 '19 17:06

@e-ago What if the user of gdrcopy were instead in charge of aligning the start address to the proper page boundary and of properly handling the offset?
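
One way to read this suggestion: the caller rounds the start address down to a 64kB boundary, pins and maps from there, and adds the remainder back to the mapped pointer. The following is a sketch of that arithmetic only, assuming the same 64kB page macros as in gdrdrv; `struct pin_range` and `compute_pin_range` are hypothetical names, and no gdrcopy call is made.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed to mirror the gdrdrv definitions: 64kB GPU pages. */
#define GPU_PAGE_SHIFT  16
#define GPU_PAGE_SIZE   ((uint64_t)1 << GPU_PAGE_SHIFT)
#define GPU_PAGE_OFFSET (GPU_PAGE_SIZE - 1)
#define GPU_PAGE_MASK   (~GPU_PAGE_OFFSET)

/* Hypothetical result type describing what the caller would pin. */
struct pin_range {
    uint64_t aligned_addr; /* start address rounded down to a GPU page */
    size_t   offset;       /* distance from aligned_addr to the user buffer */
    size_t   pin_size;     /* whole GPU pages covering the user buffer */
};

/* Hypothetical helper: split an arbitrary device address into an aligned
 * base plus an offset, and round the covered size up to whole GPU pages. */
static inline struct pin_range compute_pin_range(uint64_t addr, size_t size)
{
    struct pin_range r;
    assert(size > 0);
    r.aligned_addr = addr & GPU_PAGE_MASK;
    r.offset       = (size_t)(addr & GPU_PAGE_OFFSET);
    r.pin_size     = (size_t)((r.offset + size + GPU_PAGE_OFFSET) & GPU_PAGE_MASK);
    return r;
}
```

The caller would then add `offset` to the mapped pointer before using it. Whether the bytes between `aligned_addr` and the user buffer may legally be pinned depends on the underlying allocation, which this sketch does not check.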

drossetti, Sep 06 '19 18:09

I don't think we are ready to tackle this problem, so I am removing the 2.1 tag.

drossetti, Jun 09 '20 22:06