boxtree
boxtree copied to clipboard
Transition to arraycontext
Since this is a big breaking change, took the opportunity to remove some things that were marked as deprecated:
-
FMMTraversalInfo.colleagues_starts
andFMMTraversalInfo.colleagues_lists
-
filter_target_lists_in_user_order
andfilter_target_lists_in_tree_order
- Remove
DeviceDataRecord
.
TODO:
- [x] ~Go over classes and see how many actually need to store an array context~.
- This can be done in a follow-up PR without breaking compatibility.
- [x] Rip out timing code
- Actually replacing this requires a lot of work: needs to live in the array context.
- [ ] Rip out manual event tracking
- Needs https://github.com/inducer/pyopencl/issues/303
- [x] Needs #71
- [x] https://github.com/inducer/arraycontext/pull/177
- [ ] https://github.com/inducer/sumpy/pull/139
- [ ] https://github.com/inducer/pytential/pull/179
- [x]
distributed
is not yet ported- [x] port
distributed
- [x] fix
distributed
withoutImmutableHostDeviceArray
.
- [x] port
- [x] Point
requirements.txt
back to main
As discussed, it seems to make sense to transition from DeviceDataRecord
to arraycontext.dataclass_array_container
. The operations needed are the same
-
DeviceDataRecord.get
toarraycontext.to_numpy
. -
DeviceDataRecord.to_device
toarraycontext.from_numpy
. -
DeviceDataRecord.with_queue
toactx.freeze
oractx.thaw
(as necessary). -
DeviceDataRecord.to_host_device_array
to??
. Not sure about this one, but it seems like it should be deprecated and removed?
None of the classes based on DeviceDataRecord
require any arithmetic, just to be serialized and deserialized.
DeviceDataRecord.to_host_device_array
to??
. Not sure about this one, but it seems like it should be deprecated and removed?
Yes, I think that's the right approach. The need for this arises mostly within the distributed-memory FMM, which @gaohao95 is working on. (@gaohao95, it would be useful if you could comment!) The point is that this needs tree information on the host to inform the realization of MPI communication, and it needs it on the device to support use of tree data in the translation operators. I think it will be cleaner to have two trees, one host-side, and one device-side, in place of the dual-storage solution at the array level that is currently in place.
An alternative would be to have an array context for these host-device arrays, but that seems perhaps more heavyweight than I'd like. A complicating factor is that we do not currently have a numpy array context (cf. https://github.com/inducer/arraycontext/pull/93).
I think it will be cleaner to have two trees, one host-side, and one device-side, in place of the dual-storage solution at the array level that is currently in place.
For global trees, we are already maintaining two copies of trees, one in the device memory and another in the host memory. I think the only place that uses the ImmutableHostDeviceArray
at the moment is the local tree returned by generate_local_tree
, but it is only used for passing into a traversal builder, so for that we could just return the device array version.
An alternative would be to have an array context for these host-device arrays, but that seems perhaps more heavyweight than I'd like.
I agree we shouldn't do this. ImmutableHostDeviceArray
was only a stopgap solution in the past. We should remove all to_host_device_array
and ImmutableHostDeviceArray
.
@inducer @gaohao95 Thanks for the feedback on that! I'll try removing ImmutableHostDeviceArray
support and bug you once this gets into a better state.