
v4.1.x: opal/cuda: avoid direct access to cumem host numa memory

Open Akshay-Venkatesh opened this issue 6 months ago • 8 comments

Memory allocated with the cuMemCreate API using a location type of CU_MEM_LOCATION_TYPE_HOST, CU_MEM_LOCATION_TYPE_HOST_NUMA, or CU_MEM_LOCATION_TYPE_HOST_NUMA_CURRENT is reported as host memory by the pointer query API. That classification does not mean the CPU can access the memory with memcpy or other CPU load/store mechanisms: CPU access must be explicitly requested with cuMemSetAccess. Without the changes in this PR, HOST_NUMA-backed cuMemCreate memory is detected as host memory by the Open MPI layers (opal/datatype, ompi/coll), and subsequent accesses by a CPU thread lead to illegal access errors.
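To make the failure mode concrete, here is a minimal sketch of the decision described above, written as a plain C model that does not call CUDA. The names `model_mem_type`, `model_buffer`, and `model_cpu_can_access` are hypothetical and are not taken from this PR or from the Open MPI tree. A real check would presumably fill these fields from driver queries such as cuPointerGetAttribute and cuMemGetAccess, which is an assumption about how one might implement it rather than a description of the actual patch.

```c
#include <stdbool.h>

/* Hypothetical model types: they mirror the idea, not the CUDA driver API. */
typedef enum {
    MODEL_MEM_HOST,   /* pointer query reports host memory */
    MODEL_MEM_DEVICE  /* pointer query reports device memory */
} model_mem_type;

typedef struct {
    model_mem_type reported_type; /* result of the pointer query */
    bool vmm_backed;              /* true if allocated through cuMemCreate */
    bool cpu_access_granted;      /* true if cuMemSetAccess enabled host access */
} model_buffer;

/*
 * Returns true only when a CPU load/store (for example memcpy) is safe.
 * A "host" classification alone is not enough for cuMemCreate-backed
 * memory: CPU access must also have been granted explicitly.
 */
static bool model_cpu_can_access(const model_buffer *buf)
{
    if (buf->reported_type != MODEL_MEM_HOST) {
        return false;
    }
    if (buf->vmm_backed && !buf->cpu_access_granted) {
        return false;
    }
    return true;
}
```

In this model, a buffer with `reported_type == MODEL_MEM_HOST`, `vmm_backed == true`, and `cpu_access_granted == false` is the case this PR is concerned with: it looks like host memory, yet a CPU-side memcpy on it would fault.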

bot:notacherrypick

Akshay-Venkatesh • Aug 13 '24 17:08