OpenCL.jl
Integrate OpenCL <-> Julia Event systems
We should have better integration with Julia's underlying event / task system.
My intuition is that with clSetEventCallback (http://www.khronos.org/registry/cl/sdk/1.1/docs/man/xhtml/clSetEventCallback.html) we should be able to build either RemoteReferences or Tasks that complete when the underlying events complete.
Also of interest might be something like https://github.com/shashi/React.jl
Something along those lines might be helpful. More information here: http://julialang.org/blog/2013/05/callback/
using OpenCL; const cl = OpenCL
type Event
    status :: Symbol
    error_code :: Int
    Event() = Event(:undefined)
    Event(status :: Symbol) = new(status, 0)
end
device = first(cl.devices())
ctx = cl.Context(device)
q = cl.CmdQueue(ctx)
test = Event()
function callback(event, event_status, data :: Ptr{Void})
    # Recover the Julia Event object that was passed as user data
    event_data = unsafe_pointer_to_objref(data) :: Event
    status = :undefined   # fallback if no branch below matches
    error_code = 0
    if event_status == cl.CL_COMPLETE
        status = :complete
    elseif event_status == cl.CL_SUBMITTED
        status = :submitted
    elseif event_status == cl.CL_RUNNING
        status = :running
    elseif event_status == cl.CL_QUEUED
        status = :queued
    elseif event_status < 0
        # Negative values are OpenCL error codes
        status = :error
        error_code = event_status
    end
    event_data.status = status
    event_data.error_code = error_code
    return nothing
end
const callback_c = cfunction(callback, Void, (cl.CL_event, cl.CL_int, Ptr{Void}))
usr_evt = cl.UserEvent(ctx)
cl.enqueue_wait_for_events(q, usr_evt)
mkr_evt = cl.enqueue_marker(q)
cl.api.clSetEventCallback(mkr_evt.id, cl.CL_COMPLETE, callback_c, test)
println(test)
cl.complete(usr_evt)
sleep(1.0)
println(test)
@vchuravy that is a good sketch of what we will have to do (what you are missing is a global ObjectID Dict to hold the references to the underlying Julia objects you are hiding from the GC). We also need a Condition object as a field of the Event type. This is how you signal to Julia's event loop that something has changed and that dependent code needs to be scheduled for execution (grep Condition to see how it is used in Base). Making this more difficult is the fact that callbacks can be executed asynchronously by the OpenCL runtime, so we have to factor in thread safety (http://docs.julialang.org/en/latest/manual/calling-c-and-fortran-code/#thread-safety). I don't know what the performance hit of this is, so this is an abstraction that probably needs to be opted into instead of being the default.
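To make the two missing pieces concrete, here is a minimal sketch of an event that carries a Condition and is rooted in a global registry while a callback may still touch it. It is written in current Julia syntax (`mutable struct`, `IdDict`) rather than the `type` / ObjectID Dict spelling used above, and it does not call OpenCL at all: the names `WaitableEvent`, `LIVE_EVENTS`, `register!` and `complete!` are hypothetical, and an `@async` task stands in for the OpenCL callback. It also assumes the notification already arrives on Julia's own thread; a callback fired from an OpenCL runtime thread would need the `uv_async_send` route described in the linked thread-safety section.

```julia
# Sketch only: all names here are hypothetical, not part of OpenCL.jl.
mutable struct WaitableEvent
    status::Symbol
    error_code::Int
    cond::Condition          # lets Julia tasks block until the status changes
    WaitableEvent() = new(:undefined, 0, Condition())
end

# Global registry keyed by object identity. Holding the event here keeps it
# reachable (and so safe from the GC) while C code holds a raw pointer to it.
const LIVE_EVENTS = IdDict{WaitableEvent,Bool}()

register!(evt::WaitableEvent) = (LIVE_EVENTS[evt] = true; evt)

# What the callback would do once it is running on Julia's event loop:
# record the result, drop the GC root, and wake every waiting task.
function complete!(evt::WaitableEvent, status::Symbol, error_code::Integer=0)
    evt.status = status
    evt.error_code = error_code
    delete!(LIVE_EVENTS, evt)
    notify(evt.cond)
    return nothing
end

# Block the current task until the event reaches a final state.
function Base.wait(evt::WaitableEvent)
    while evt.status != :complete && evt.status != :error
        wait(evt.cond)
    end
    return evt.status
end

evt = register!(WaitableEvent())

# Stand-in for the OpenCL runtime completing the event a little later.
@async begin
    sleep(0.1)
    complete!(evt, :complete)
end

wait(evt)   # yields to other tasks instead of sleeping for a fixed time
```

Compared with the `sleep(1.0)` in the original snippet, the waiting task is rescheduled as soon as `notify` is called, and other tasks keep running in the meantime.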
With CPU OpenCL implementations (and Intel's Xeon Phi), the task-parallel model that OpenCL provides is a great way to execute multiple different kernels concurrently. It is an underutilized part of the spec because it currently makes no sense for GPUs.