boxtree icon indicating copy to clipboard operation
boxtree copied to clipboard

Transition to arraycontext

Open alexfikl opened this issue 2 years ago • 4 comments

Since this is a big breaking change, took the opportunity to remove some things that were marked as deprecated:

  • FMMTraversalInfo.colleagues_starts and FMMTraversalInfo.colleagues_lists
  • filter_target_lists_in_user_order and filter_target_lists_in_tree_order
  • Remove DeviceDataRecord.

TODO:

  • [x] ~Go over classes and see how many actually need to store an array context~.
    • This can be done in a follow-up PR without breaking compatibility.
  • [x] Rip out timing code
    • Actually replacing this requires a lot of work: needs to live in the array context.
  • [ ] Rip out manual event tracking
    • Needs https://github.com/inducer/pyopencl/issues/303
  • [x] Needs #71
  • [x] https://github.com/inducer/arraycontext/pull/177
  • [ ] https://github.com/inducer/sumpy/pull/139
  • [ ] https://github.com/inducer/pytential/pull/179
  • [x] distributed is not yet ported
    • [x] port distributed
    • [x] fix distributed without ImmutableHostDeviceArray.
  • [x] Point requirements.txt back to main

alexfikl avatar Jun 24 '22 16:06 alexfikl

As discussed, it seems to make sense to transition from DeviceDataRecord to arraycontext.dataclass_array_container. The operations needed are the same

  • DeviceDataRecord.get to arraycontext.to_numpy.
  • DeviceDataRecord.to_device to arraycontext.from_numpy.
  • DeviceDataRecord.with_queue to actx.freeze or actx.thaw (as necessary).
  • DeviceDataRecord.to_host_device_array to ??. Not sure about this one, but it seems like it should be deprecated and removed?

None of the classes based on DeviceDataRecord require any arithmetic, just to be serialized and deserialized.

alexfikl avatar Jun 24 '22 16:06 alexfikl

  • DeviceDataRecord.to_host_device_array to ??. Not sure about this one, but it seems like it should be deprecated and removed?

Yes, I think that's the right approach. The need for this arises mostly within the distributed-memory FMM, which @gaohao95 is working on. (@gaohao95, it would be useful if you could comment!) The point is that this needs tree information on the host to inform the realization of MPI communication, and it needs it on the device to support use of tree data in the translation operators. I think it will be cleaner to have two trees, one host-side, and one device-side, in place of the dual-storage solution at the array level that is currently in place.

An alternative would be to have an array context for these host-device arrays, but that seems perhaps more heavyweight than I'd like. A complicating factor is that we do not currently have a numpy array context (cf. https://github.com/inducer/arraycontext/pull/93).

inducer avatar Jun 24 '22 22:06 inducer

I think it will be cleaner to have two trees, one host-side, and one device-side, in place of the dual-storage solution at the array level that is currently in place.

For global trees, we are already maintaining two copies of trees, one in the device memory and another in the host memory. I think the only place that uses the ImmutableHostDeviceArray at the moment is the local tree returned by generate_local_tree, but it is only used for passing into a traversal builder, so for that we could just return the device array version.

An alternative would be to have an array context for these host-device arrays, but that seems perhaps more heavyweight than I'd like.

I agree we shouldn't do this. ImmutableHostDeviceArray was only a stopgap solution in the past. We should remove all to_host_device_array and ImmutableHostDeviceArray.

gaohao95 avatar Jun 26 '22 07:06 gaohao95

@inducer @gaohao95 Thanks for the feedback on that! I'll try removing ImmutableHostDeviceArray support and bug you once this gets into a better state.

alexfikl avatar Jun 26 '22 14:06 alexfikl