pyopencl icon indicating copy to clipboard operation
pyopencl copied to clipboard

`Array.events` spuriously synchronizes on read-after-read

Open inducer opened this issue 4 years ago • 6 comments

  • [ ] There should be Array.write_events and Array.read_events, with Array.events aliasing write_events under a deprecation.
  • [ ] Arguments to all the generated kernels (elementwise etc.) should declare whether they're writing and automatically add to the appropriate events list.
  • [ ] Audit code calling into the kernels to avoid duplicate event insertion.

cc @alexfikl @zachjweiner

inducer avatar Oct 21 '19 14:10 inducer

If I'm recalling correctly, I think the idea was that once we can distinguish between read- and write-synchronization points, there will be no need to enforce the order of two kernels based on an array that they both (only) read. For example, the operations y = 2 * x and y_h = 2 * x.get() do not need to be ordered, but the first would add an event to x.events which the call to get would wait for. In principle they could execute asynchronously, but Array's event tracking enforces this synchronization at the moment (because it has no way to know that neither execution modifies x).

Does this make sense?

zachjweiner avatar Oct 21 '19 15:10 zachjweiner

Thanks for explaining, I remember now. The current Array.events system needs to be redone to wait on reads before making in-place changes.

!91 adds synchronization for WAR in .get(), but the overall issue is more general, IMO.

This is really what gl-19 was meant to capture. (which we'll have this issue subsume)

inducer avatar Oct 21 '19 15:10 inducer

@zachjweiner Could you remind me which way this issue went? According to my experiments, I believe 2*x does not currently add the resulting event to x.events (as of a60b4ed), so there's is not currently spurious synchronization. But I could see an argument that there should be synchronization for 2*x, to ensure that nobody modifies x before 2*x is done reading. Was this the core of the issue?

inducer avatar Oct 29 '20 18:10 inducer

I believe 2*x does not currently add the resulting event to x.events

I find the same - it seems events are only appended to self in array.py and never other.

But I could see an argument that there should be synchronization for 2*x, to ensure that nobody modifies x before 2*x is done reading. Was this the core of the issue?

Yes, I believe so. And I definitely believe the argument.

Looking back at gl-18 I see we discussed this issue with regard to calls to Array.get. And it does seem that the MR did add the get event to Array.events.

PS - I think #302 can be (read: was intended to be) closed.

zachjweiner avatar Oct 29 '20 18:10 zachjweiner

@inducer I was looking a bit through the code to see how to best solve the second point here, by having the kernels be more explicit about inputs and outputs.

First, it seems like these are the kernels that need work (missed any?):

  • pyopencl.elementwise: ElementwiseKernel
  • pyopencl.reduction: ReductionKernel
  • pyopencl.scan: GenericScanKernel, GenericDebugScanKernel, LegacyScanKernelBase (?)

They all take an arguments parameter in __init__ and __call__ takes the arrays as positional arguments in *args. What would be a good way to mark these as input / output in a backwards compatible way?

  • Add an inputs (defaulting to arguments) and outputs (defaulting to arguments) to __init__? Not sure this can be done in a backwards compatible way though, since everything is a positional argument.
  • Add an outputs: List[int] to __init__ to mark which of the arguments in arguments are outputs?
  • We could mark pyopencl.tools.Argument with a flag and have any new code pass those in __init__(arguments)? Something like how ArrayArg seems to work in loopy. Not sure what to do about the case where arguments is a string though (deprecated?).
  • Some keyword arg in __call__ denoting which arrays in *args are inputs / outputs? Probably shouldn't mess around with call sites though, so I don't like this..
  • Introduce some new kernel classes with a better interface? Kind of annoying since the good names are already taken..
  • [YOUR IDEA HERE :heart:]

EDIT: Should have probably said read / write instead of input / output above, since that's clearer.

alexfikl avatar Oct 10 '22 17:10 alexfikl

  • We could mark pyopencl.tools.Argument with a flag and have any new code pass those in __init__(arguments)? Something like how ArrayArg seems to work in loopy. Not sure what to do about the case where arguments is a string though (deprecated?).

I think this is the sanest. For the string arguments, this could come from either a const or a __read_only__ (sp? It's a CL thing.)

inducer avatar Oct 16 '22 00:10 inducer