Defer to merge_chunks in special cases of rechunk
#221 introduced merge_chunks, a special case of rechunk that can be implemented using blockwise. I noticed that whilst reduction calls merge_chunks directly, ops.rechunk always calls the primitive rechunk. Shouldn't it be possible for ops.rechunk to check whether the user is asking for that special case, and internally dispatch to merge_chunks?
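As a rough illustration of the check being proposed, here is a minimal sketch in plain Python. The function name is_merge_only is hypothetical and not part of Cubed, and the criterion is an assumption: with regular chunks, a rechunk only merges chunks when every target chunk boundary is also a source chunk boundary.

```python
def is_merge_only(shape, source_chunks, target_chunks):
    """Return True if rechunking only needs to merge whole source chunks.

    Hypothetical helper, not Cubed's API. Assumes regular chunking, where
    each entry of source_chunks/target_chunks is the chunk size on one axis.
    """
    if not (len(shape) == len(source_chunks) == len(target_chunks)):
        raise ValueError("shape and chunks must have the same number of axes")
    for n, s, t in zip(shape, source_chunks, target_chunks):
        # A target chunk spanning the whole axis has no interior boundaries,
        # so it can always be built by merging source chunks.
        if t >= n:
            continue
        # Otherwise the first target boundary (at offset t) must coincide
        # with a source boundary, i.e. t must be a multiple of s.
        if t % s != 0:
            return False
    return True


print(is_merge_only((10, 10), (2, 5), (4, 10)))  # merge-only case
print(is_merge_only((10,), (3,), (4,)))          # needs a general rechunk
```

A dispatching rechunk could call a check like this first and fall back to the general primitive when it returns False.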
This also makes me wonder whether there are any other special cases of rechunk that could be written using blockwise.
This should be possible, but I'm not sure how much this occurs in practice. The only calls to rechunk in Cubed are in reshape and from_array (for some cases).
It will also happen if an xarray user calls .chunk on an already-chunked array, because it dispatches to cubed's rechunk method.
I agree it's not a very common case (though I expect it to come up in the full pangeo vorticity example where we pad then rechunk to merge the padded values back in).
"there are any other special cases of rechunk that could be written using blockwise."
"It will also happen if an xarray user calls .chunk on an already-chunked array"
This is very confusing to me.
Isn't a rechunk without an on-disk intermediate not "blockwise" by definition (you are communicating across chunks)? I thought the optimization was effectively optimizing chunking when reading from an intermediate store by looking at the chunks needed for the succeeding operation. But perhaps I'm misunderstanding something.
merge_chunks is implemented using map_direct, which "works by creating an empty array that has the same shape and chunk structure as the output, and calling map_blocks on this empty array, passing in the input arrays as side inputs to the function, which may access them in whatever way is needed."
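To make that description concrete, here is a toy one-dimensional sketch in plain Python, with lists standing in for stored chunks. The names map_direct_1d and merge_chunks_1d are hypothetical and only mimic the idea; they are not Cubed's implementation. Each output block is computed independently from its block id, reading whichever source blocks it needs from the side input.

```python
def map_direct_1d(func, num_output_blocks, *side_inputs):
    """Toy stand-in for the map_direct idea (not Cubed's API).

    Calls func once per output block, passing the block id and the
    side inputs, which func may read from however it likes.
    """
    return [func(block_id, *side_inputs) for block_id in range(num_output_blocks)]


def merge_chunks_1d(source_blocks, factor):
    """Merge every `factor` consecutive source blocks into one output block."""

    def read_and_merge(block_id, blocks):
        # Output block i reads source blocks [i*factor, (i+1)*factor).
        start = block_id * factor
        merged = []
        for block in blocks[start:start + factor]:
            merged.extend(block)
        return merged

    # Ceiling division: the last output block may be built from fewer blocks.
    num_output_blocks = -(-len(source_blocks) // factor)
    return map_direct_1d(read_and_merge, num_output_blocks, source_blocks)


source = [[0, 1], [2, 3], [4, 5], [6]]
print(merge_chunks_1d(source, 2))
```

No output block depends on another output block, which is why this can be scheduled like a blockwise operation even though each task reads several input chunks.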
That might resolve it for you @dcherian ?
Alternatively, the way I have been thinking about this merge_chunks operation is as just one half of what rechunker does. In Tom A's original suggestion that led to rechunker, he breaks general rechunking into a split pass and a merge pass. If you can accomplish the specific rechunk with only the merge pass, you don't need the intermediate store.
(This also suggests that an equivalent split_chunks might be possible to implement using map_direct.)
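A split-only pass could plausibly follow the same pattern. The sketch below is a toy one-dimensional version in plain Python; split_chunks_1d is a hypothetical name, and it assumes regular source chunks whose size is a multiple of the target chunk size, so that each output block reads a slice of exactly one source block.

```python
def split_chunks_1d(source_blocks, source_size, target_size):
    """Split regular source blocks into smaller output blocks.

    Hypothetical sketch, not Cubed's API. Assumes every source block has
    length source_size (the last may be shorter) and that source_size is
    a multiple of target_size.
    """
    if source_size % target_size != 0:
        raise ValueError("source_size must be a multiple of target_size")

    total = sum(len(block) for block in source_blocks)
    # Ceiling division: the last output block may be shorter.
    num_output_blocks = -(-total // target_size)

    def read_slice(block_id):
        # Global offset of this output block, mapped to one source block.
        start = block_id * target_size
        src_index = start // source_size
        offset = start % source_size
        return source_blocks[src_index][offset:offset + target_size]

    # Each output block is computed independently from its block id.
    return [read_slice(block_id) for block_id in range(num_output_blocks)]


source = [[0, 1, 2, 3], [4, 5, 6]]
print(split_chunks_1d(source, source_size=4, target_size=2))
```

As with the merge case, no intermediate store is needed here because every output block can be produced from a single read of the source.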