pyopencl
Add read_events and write_events to Array
Takes care of the first point from #303.
- Adds `read_events` and `write_events` to `Array`.
- Teaches all the functions in `cl.array` to use those and not wait on read events when they don't need to.
- Teaches all the functions in `cl.clmath` to play nice as well.
- The rest of the code just does a straight port from `events -> write_events`. This still needs work.
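To make the split between the two lists concrete, here is a minimal sketch in plain Python. It is not the code from this PR and does not touch pyopencl: `TrackedArray` and its two helper methods are hypothetical names, and plain strings stand in for OpenCL events. The only point is that a reader depends on pending writes, while a writer depends on both lists.

```python
from dataclasses import dataclass, field


@dataclass
class TrackedArray:
    """Toy stand-in for an array that tracks pending events.

    Hypothetical illustration only, not pyopencl's Array.
    """
    name: str
    read_events: list = field(default_factory=list)
    write_events: list = field(default_factory=list)

    def events_to_wait_before_read(self):
        # A reader only has to wait for pending writes (read-after-write).
        return list(self.write_events)

    def events_to_wait_before_write(self):
        # A writer has to wait for pending reads and pending writes.
        return list(self.read_events) + list(self.write_events)


a = TrackedArray("a")
a.write_events.append("fill_a")  # stand-in: some kernel wrote to a
a.read_events.append("sum_a")    # stand-in: some kernel is reading a

reader_deps = a.events_to_wait_before_read()
writer_deps = a.events_to_wait_before_write()
```

With a single event list, the reader above would also have waited on `"sum_a"`, which is the unnecessary wait the description mentions.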
TODO:
- [ ] Needs corresponding loopy change
@inducer This should be in a non-horrible state now. Can you take a quick look to see if it's doing what you had in mind?
From going through it, it seems like several things still need work:
- `elwise_kernel_runner` could probably add to `write_events` for the first arg and to `read_events` for the rest of them, so we don't clutter all the rest of the code. Does that sound reasonable?
- It seems to me that something like `C = AXPBY` would need to wait for `read_events` on `C` and `write_events` on `A` and `B`. Does that sound right? (+ similar logic for the other elementwise kernels)
> `elwise_kernel_runner` could probably add to `write_events` for the first arg and to `read_events` for the rest of them, so we don't clutter all the rest of the code. Does that sound reasonable?

Does.
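A rough sketch of that bookkeeping, in plain Python rather than against pyopencl: `ToyArray`, `record_events`, and the `axpby` stub are all hypothetical, and the "event" is just a string. It assumes the convention discussed above, that the first argument is the output and the remaining array arguments are inputs.

```python
class ToyArray:
    """Hypothetical array stand-in carrying the two event lists."""

    def __init__(self, name):
        self.name = name
        self.read_events = []
        self.write_events = []


def record_events(enqueue):
    """Hypothetical wrapper: args[0] is the output, the rest are inputs."""
    def runner(out, *inputs):
        evt = enqueue(out, *inputs)
        # The kernel writes to the first argument...
        out.write_events.append(evt)
        # ...and only reads from the remaining array arguments.
        for arg in inputs:
            if isinstance(arg, ToyArray):  # scalars carry no events
                arg.read_events.append(evt)
        return evt
    return runner


@record_events
def axpby(out, a, x, b, y):
    # Stand-in for enqueueing a kernel; returns a fake event.
    return "axpby-event"


c, x, y = ToyArray("c"), ToyArray("x"), ToyArray("y")
evt = axpby(c, 2.0, x, 3.0, y)
```

Doing this in one wrapper keeps the individual array operations free of event bookkeeping, which is the "don't clutter the rest of the code" motivation.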
> - It seems to me that something like `C = AXPBY` would need to wait for `read_events` on `C` and `write_events` on `A` and `B`. Does that sound right? (+ similar logic for the other elementwise kernels)

The way I see it, it wouldn't have to wait on `read_events`, because read-after-read does not need synchronization. Or am I missing something?
> The way I see it, it wouldn't have to wait on `read_events`, because read-after-read does not need synchronization. Or am I missing something?

In that example, `C` would be a write-after-read, right? That's why I was thinking that all the reads should finish before it gets modified.
> It seems to me that something like `C = AXPBY` would need to wait for `read_events` on `C` and `write_events` on `A` and `B`. Does that sound right? (+ similar logic for the other elementwise kernels)

`C` would need to wait for `read_events` and `write_events`, unless you'd make the two subsets of each other.
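The rule the thread converges on can be written down as a small sketch. This is plain Python with dicts and string events, not pyopencl code; `needs_sync` and `wait_list_for_elementwise` are hypothetical helper names. It assumes the two lists are kept separate (neither is a subset of the other), so the output has to contribute both.

```python
def needs_sync(earlier, later):
    """Return True if `later` must wait for `earlier` on the same buffer.

    Each argument is "read" or "write". Only read-after-read is free.
    """
    return "write" in (earlier, later)


def wait_list_for_elementwise(out, inputs):
    """Events an elementwise kernel writing `out` from `inputs` waits on."""
    # Output: write-after-read and write-after-write both need ordering.
    deps = list(out["read_events"]) + list(out["write_events"])
    # Inputs: only read-after-write needs ordering.
    for ary in inputs:
        deps.extend(ary["write_events"])
    # Drop duplicates (e.g. in-place updates) while keeping order.
    return list(dict.fromkeys(deps))


A = {"read_events": ["r_a"], "write_events": ["w_a"]}
B = {"read_events": [], "write_events": ["w_b"]}
C = {"read_events": ["r_c"], "write_events": ["w_c"]}

# C = AXPBY: C is written, A and B are only read.
deps = wait_list_for_elementwise(C, [A, B])
```

Note that `"r_a"` does not appear in `deps`: a pending read of `A` does not block another read of `A`.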
@inducer This should be ready for a look. It could probably use some more tests, but I'm not quite sure what we can do there. Any ideas?
@inducer This is the macOS error I was worried about: https://github.com/inducer/pyopencl/actions/runs/3210604562/jobs/5248179538. It seems to be happening on other branches as well, though, e.g. https://github.com/inducer/pyopencl/actions/runs/3254926391/jobs/5343705664, so it's likely no biggie (?). With that in mind, I feel better about this not introducing any cool new bugs :D