Wrap clCreateCommandQueueWithProperties

Open inducer opened this issue 6 years ago • 10 comments

This would be the pattern to follow: https://github.com/inducer/pyopencl/blob/21b09e316b00765d9c1612d4ad6b078003939049/src/c_wrapper/command_queue.cpp#L42-L66

inducer avatar Aug 19 '17 19:08 inducer

I tried the below quick hack but I am getting INVALID_COMMAND_QUEUE later on. There is no error upon calling clCreateCommandQueue though. Also setting an invalid QUEUE_SIZE does result in INVALID_QUEUE_VALUE, for example on AMD creating a queue with 16MB size when 8MB is max.

https://github.com/glupescu/pyopencl/commit/5874ce02a2e28cc40c431d8e839590f06242b6cd#diff-93de4fb0b5fa13ed2f66448c780b3102

From stackoverflow https://stackoverflow.com/questions/45767759/how-to-set-device-side-queue-size-in-pyopencl/49957843#49957843

glupescu avatar Apr 21 '18 16:04 glupescu

Sorry, I don't have the spare cycles at this moment to investigate in detail. I've put this on my list for later in the summer.

inducer avatar Apr 23 '18 03:04 inducer

#240 adds support for this. I'd be happy to hear your feedback.

inducer avatar Aug 13 '18 20:08 inducer

Will definitely check this out soon - thanks for adding support on this.

glupescu avatar Aug 14 '18 11:08 glupescu

Kicking this slightly back to life: did you ever manage to get a device-side queue working through pyopencl? I have spent the better part of today trying to make this work, but the closest I've gotten is the queue being created and a "clEnqueueNDRangeKernel failed: INVALID_COMMAND_QUEUE" being thrown at me when I try to enqueue a dumb kernel (that does nothing).

atypic avatar Jul 08 '20 15:07 atypic

What ICD (OpenCL driver) are you using?

inducer avatar Jul 08 '20 15:07 inducer

Edit: PEBCAK

The ranting below is here because I didn't understand that you can't enqueue to a device-side queue from the host side. You need two queues: one on the host, one on the device. You can mark the device queue as default.

-- I've tried both the Nvidia (1.2) and Intel (2.1) runtimes. The method complains about incompatibility when I use Nvidia, of course.

Both using this way: cl.CommandQueue(self._cl_context, properties=cmcq.ON_DEVICE | cmcq.ON_DEVICE_DEFAULT | cmcq.OUT_OF_ORDER_EXEC_MODE_ENABLE)

and... cl.CommandQueue(self._cl_context, properties = [cmq.PROPERTIES, cmcq.ON_DEVICE | cmcq.ON_DEVICE_DEFAULT | cmcq.OUT_OF_ORDER_EXEC_MODE_ENABLE, cmq.SIZE, 1024]) leads to pyopencl._cl.LogicError: clEnqueueNDRangeKernel failed: INVALID_COMMAND_QUEUE

Actually, I lie: on Nvidia this leads to a segfault, though I have read that the ...withProperties() function is supported now.

Removing this and simply making an in-order on-host queue (default) the kernel runs fine...
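For readers landing here, the two-queue arrangement from the edit note above could be sketched roughly as follows. This is a minimal sketch, not a tested recipe: it assumes that `cmq` and `cmcq` in the snippets above stand for `cl.queue_properties` and `cl.command_queue_properties`, and the helper names `device_queue_properties` and `make_queues` are made up for illustration. Whether the device queue is actually created depends on the driver supporting OpenCL 2.x.

```python
try:
    import pyopencl as cl  # needs an OpenCL 2.x capable driver at run time
except ImportError:        # keeps the pure helper below usable without a driver
    cl = None


def device_queue_properties(cmq, cmcq, size=None):
    """Build the property list for a default on-device queue.

    Assumption: cmq is cl.queue_properties (PROPERTIES, SIZE) and
    cmcq is cl.command_queue_properties (ON_DEVICE, ...).
    """
    flags = (cmcq.ON_DEVICE
             | cmcq.ON_DEVICE_DEFAULT
             | cmcq.OUT_OF_ORDER_EXEC_MODE_ENABLE)
    props = [cmq.PROPERTIES, flags]
    if size is not None:
        # Optional device queue size in bytes; drivers enforce their own maximum.
        props += [cmq.SIZE, size]
    return props


def make_queues(ctx, size=None):
    """Return (host_queue, device_queue) for an existing context (hypothetical helper)."""
    if cl is None:
        raise RuntimeError("pyopencl is not installed")
    # The host enqueues kernels here, never on the device queue.
    host_queue = cl.CommandQueue(ctx)
    # Only kernels running on the device enqueue work on this one.
    device_queue = cl.CommandQueue(
        ctx,
        properties=device_queue_properties(
            cl.queue_properties, cl.command_queue_properties, size))
    return host_queue, device_queue
```

With this split, host-side launches go through `host_queue` only; the device queue just has to stay alive so that kernels can reach it, for example via `get_default_queue()` in OpenCL C 2.0.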

atypic avatar Jul 09 '20 09:07 atypic

Thanks for following up! Just to be clear: Did you get things to work on Intel? (I'd expect that to work more than I'd expect the same of Nvidia.)

inducer avatar Jul 09 '20 19:07 inducer

Eh!

It's complicated. So, I am for sure able to create on-device queues on both Intel and Nvidia platforms. I have made the following observations:

  • Using the ...withProperties() call is required for doing this on Nvidia. For Intel I can use both calls and it works, but only on certain cards. My desktop has a 1660 and it doesn't work (OUT OF RESOURCES error), but the same code on a Tesla V100 works. I have an AMD card as well that throws "out of host memory" when I try to make the second queue using the withProperties() function, but I am able to use the 'normal' CreateCommandQueue().

  • I can enqueue_kernel() on both Intel and Nvidia, BUT on both platforms I get hangs if I do not turn off code caching. No idea why.

atypic avatar Jul 10 '20 11:07 atypic

Thanks for reporting back! Could you share some example code? I'd like to include that in the tests, if for no other reason than to make sure that the things that are working stay working.

inducer avatar Jul 10 '20 16:07 inducer