cuda-api-wrappers
Full representation of launch configurations + launch-with-full-config support
Since CUDA 12, the driver API supports a proper launch configuration object, which carries the usual grid and block dimensions plus an extensible list of launch attributes. It is passed to:
    CUresult cuLaunchKernelEx(const CUlaunchConfig* config, CUfunction f, void** kernelParams, void** extra);
with the launch config being:
    typedef struct CUlaunchConfig_st {
        unsigned int gridDimX;
        unsigned int gridDimY;
        unsigned int gridDimZ;
        unsigned int blockDimX;
        unsigned int blockDimY;
        unsigned int blockDimZ;
        unsigned int sharedMemBytes;
        CUstream hStream;
        CUlaunchAttribute* attrs;
        unsigned int numAttrs;
    } CUlaunchConfig;
Each attribute has an ID and a value in a union, and here is the current list of IDs:
CU_LAUNCH_ATTRIBUTE_IGNORE
CU_LAUNCH_ATTRIBUTE_ACCESS_POLICY_WINDOW
CU_LAUNCH_ATTRIBUTE_COOPERATIVE
CU_LAUNCH_ATTRIBUTE_SYNCHRONIZATION_POLICY
CU_LAUNCH_ATTRIBUTE_CLUSTER_DIMENSION
CU_LAUNCH_ATTRIBUTE_CLUSTER_SCHEDULING_POLICY_PREFERENCE
CU_LAUNCH_ATTRIBUTE_PROGRAMMATIC_STREAM_SERIALIZATION
CU_LAUNCH_ATTRIBUTE_PROGRAMMATIC_EVENT
CU_LAUNCH_ATTRIBUTE_PRIORITY
CU_LAUNCH_ATTRIBUTE_MEM_SYNC_DOMAIN_MAP
CU_LAUNCH_ATTRIBUTE_MEM_SYNC_DOMAIN
CU_LAUNCH_ATTRIBUTE_LAUNCH_COMPLETION_EVENT
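Some of these concern launch-related or scheduling-related events (e.g. CU_LAUNCH_ATTRIBUTE_PROGRAMMATIC_EVENT and CU_LAUNCH_ATTRIBUTE_LAUNCH_COMPLETION_EVENT); support for those should probably be tracked in a separate missing-feature issue.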
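To make the request a bit more concrete, here is a rough sketch of how a wrapper-side launch configuration could collect optional settings and marshal them into an ID-plus-union attribute array of the kind described above. Everything in it is an assumption for illustration: the `sketch` namespace, the `launch_config` class and its method names are hypothetical and are not the existing cuda-api-wrappers API, and the stand-in types only mimic the shape of `CUlaunchAttribute` and `CUlaunchConfig` (two attribute kinds, no stream handle) so that the example compiles without `cuda.h`.

```cpp
#include <cassert>
#include <vector>

namespace sketch {

// Stand-ins for the driver's attribute ID enum and value union.
// Only two attribute kinds are modelled here; the real list is longer.
enum attribute_id { attr_cooperative, attr_priority };

union attribute_value {
    int cooperative;  // non-zero requests a cooperative launch
    int priority;     // requested launch priority
};

struct attribute {
    attribute_id id;
    attribute_value value;
};

// Same general shape as CUlaunchConfig, minus the stream handle.
struct raw_config {
    unsigned grid_x, grid_y, grid_z;
    unsigned block_x, block_y, block_z;
    unsigned shared_mem_bytes;
    const attribute* attrs;  // null when num_attrs == 0
    unsigned num_attrs;
};

// Hypothetical wrapper-side representation: plain typed members,
// converted into an attribute array only when a launch is prepared.
class launch_config {
public:
    launch_config(unsigned grid_x, unsigned block_x)
        : grid_{grid_x, 1, 1}, block_{block_x, 1, 1}
    {
        assert(grid_x > 0 && block_x > 0);
    }

    launch_config& shared_memory(unsigned bytes) { shared_mem_bytes_ = bytes; return *this; }
    launch_config& cooperative(bool enabled) { cooperative_ = enabled; return *this; }
    launch_config& priority(int value) { has_priority_ = true; priority_ = value; return *this; }

    // Emits one attribute per non-default setting into `storage` and returns
    // a raw config pointing into it; `storage` must outlive the result.
    raw_config marshal(std::vector<attribute>& storage) const
    {
        storage.clear();
        if (cooperative_) {
            attribute a{};
            a.id = attr_cooperative;
            a.value.cooperative = 1;
            storage.push_back(a);
        }
        if (has_priority_) {
            attribute a{};
            a.id = attr_priority;
            a.value.priority = priority_;
            storage.push_back(a);
        }
        return raw_config{
            grid_[0], grid_[1], grid_[2],
            block_[0], block_[1], block_[2],
            shared_mem_bytes_,
            storage.empty() ? nullptr : storage.data(),
            static_cast<unsigned>(storage.size())
        };
    }

private:
    unsigned grid_[3];
    unsigned block_[3];
    unsigned shared_mem_bytes_ = 0;
    bool cooperative_ = false;
    bool has_priority_ = false;
    int priority_ = 0;
};

} // namespace sketch
```

With something along these lines, `sketch::launch_config(4, 256).cooperative(true).priority(2)` would marshal into a raw config with two attributes, while a configuration with no optional settings would yield an empty attribute list and could presumably fall back to the plain launch path. Keeping the attribute array in caller-provided storage is just one possible design; the wrapper could equally own it.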