compute icon indicating copy to clipboard operation
compute copied to clipboard

Correct compute::context::get_devices method

Open JablonskiMateusz opened this issue 4 years ago • 10 comments

get a vector of cl_device_id and then use the device ids to populate a vector of compute::device

Signed-off-by: Mateusz Jablonski [email protected]

JablonskiMateusz avatar Jun 24 '20 12:06 JablonskiMateusz

Coverage Status

Coverage increased (+0.008%) to 84.027% when pulling 881ca4231de0b85b986b3e6a19949edcdf6340ec on JablonskiMateusz:master into 36c89134d4013b2e5e45bc55656a18bd6141995a on boostorg:master.

coveralls avatar Jun 24 '20 16:06 coveralls

@jszuppe could you look at this PR?

JablonskiMateusz avatar Jul 08 '20 17:07 JablonskiMateusz

Hey @JablonskiMateusz! Could you provide some detail on the motivation for this change? Was this causing issues with your OpenCL library?

For context, device is a very thin wrapper on cl_device_id, and they have identical memory layouts. There should be no need to first populate vector<cl_device_id> and then copy into vector<device>.

kylelutz avatar Jul 08 '20 17:07 kylelutz

Hi @kylelutz, the problem is scenario with sub devices. The current implementation is that boost queries number of context devices, then creates a vector of compute::device (default constructor of compute::device is called, without calling clRetainDevice), then it passes the pointer to data of the vector to clGetContextInfo and it fills the devices (also without calling clRetainDevice) and the vector is passed to the user's app. When the app destroys the vector, then the destructor of compute::device checks if it is a sub device and it calls clReleaseDevice. This causes that boost calls clReleaseDevice without calling clRetainDevice before. According to OpenCL spec, it caues UB:

After the device reference count becomes zero and all the objects attached to device (such as
command-queues) are released, the device object is deleted. Using this function to release a
reference that was not obtained by creating the object or by calling clRetainDevice causes
undefined behavior.

IMHO the cl_device_id should be passed to compute::device to its constructor or to assignment operator, currently it is performed in a quite hacky way like memcpy to the memory of compute::device .

My change was inspired by compute::program::get_devices where the vector of compute::device is populated from vector of cl_device_id https://github.com/boostorg/compute/blob/master/include/boost/compute/program.hpp#L191

JablonskiMateusz avatar Jul 09 '20 08:07 JablonskiMateusz

@kylelutz @jszuppe any comments?

JablonskiMateusz avatar Jul 14 '20 06:07 JablonskiMateusz

I think that in a real application the compute::context::get_devices should not be the critical path, so having a more correct version does not seem like a problem.

keryell avatar Jul 14 '20 18:07 keryell

@kylelutz ping

JablonskiMateusz avatar Jul 27 '20 06:07 JablonskiMateusz

@kylelutz @jszuppe ping

JablonskiMateusz avatar Aug 20 '20 15:08 JablonskiMateusz

kindly reminder @kylelutz

JablonskiMateusz avatar Oct 07 '20 10:10 JablonskiMateusz

@kylelutz @jszuppe could you merge this PR?

JablonskiMateusz avatar Nov 09 '20 13:11 JablonskiMateusz