
Introduce a class to model parts of an operation's memory usage

Open tomwhite opened this issue 11 months ago • 0 comments

Consider a simple blockwise operation with one input, where each task carries out the following steps:

  1. read compressed Zarr chunk
  2. decompress Zarr chunk to produce the input array
  3. apply the operation to produce the output array (which may or may not be a different instance from 2.)
  4. write compressed Zarr chunk

We currently model this as using four times the size of the chunk (see explanation here). This estimate is supported by examining the memory usage in tools like Fil.
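As a rough illustration of the current model, here is a minimal sketch in which each of the four steps above contributes one buffer, and each buffer is conservatively counted as one full chunk. `MemoryPart`, `blockwise_parts` and `projected_mem` are hypothetical names made up for this sketch, not existing Cubed API, and counting the compressed buffers at full chunk size is an assumption.

```python
from dataclasses import dataclass
from math import prod


@dataclass(frozen=True)
class MemoryPart:
    """One named contribution to a task's memory usage (hypothetical class)."""

    name: str
    nbytes: int


def blockwise_parts(chunk_shape, itemsize):
    """Parts for a one-input blockwise task, one per step in the list above.

    Assumption: every buffer, compressed or not, is counted as one full chunk.
    """
    chunk_nbytes = prod(chunk_shape) * itemsize
    return [
        MemoryPart("compressed chunk read from Zarr", chunk_nbytes),
        MemoryPart("decompressed input array", chunk_nbytes),
        MemoryPart("output array", chunk_nbytes),
        MemoryPart("compressed chunk to write to Zarr", chunk_nbytes),
    ]


def projected_mem(parts):
    """Sum the parts, i.e. assume they are all live at the same time."""
    return sum(part.nbytes for part in parts)


# A (1000, 1000) chunk of 8-byte values is 8 MB, so the model gives 4 x 8 MB.
parts = blockwise_parts((1000, 1000), itemsize=8)
total = projected_mem(parts)
```

Keeping the parts as separate named entries, rather than a single multiplier, is what would let later code drop or resize individual contributions.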

If there are multiple inputs then things get more complicated, depending on how the operation allocates memory and on whether earlier inputs are freed before later ones are read. But there are a fairly small number of categories of operation, so it should be possible to model what they do.
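To make the multiple-input case concrete, one possible way to reason about it is to describe a task as a sequence of allocations and frees and take the running maximum. The sketch below is hypothetical: `peak_memory` and the two event lists are invented for illustration, and they only track the decompressed arrays, not the compressed buffers.

```python
def peak_memory(events):
    """Return the peak of a running total over (label, delta_bytes) events.

    A positive delta is an allocation, a negative delta is a free.
    """
    current = peak = 0
    for _label, delta in events:
        current += delta
        peak = max(peak, current)
    return peak


chunk = 8_000_000  # assumed size in bytes of one decompressed chunk

# Category A (assumed): both inputs are held until the output exists.
hold_all = [
    ("input 1", +chunk),
    ("input 2", +chunk),
    ("output", +chunk),
]

# Category B (assumed): input 1 is folded into the output and freed
# before input 2 is read.
free_early = [
    ("input 1", +chunk),
    ("output", +chunk),
    ("free input 1", -chunk),
    ("input 2", +chunk),
]

peak_hold_all = peak_memory(hold_all)      # three chunks live at once
peak_free_early = peak_memory(free_early)  # at most two chunks live at once
```

Under these assumptions the two categories differ by a whole chunk, which is the kind of distinction a per-operation model would need to capture.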

The point of modelling this would be to work out the projected memory usage for operations that have been fused. For example, fusing two operations means that the intermediate Zarr file is not written, so these parts of the memory usage can be dropped.
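A sketch of how fusion might look with such a model: each part records which operation it belongs to and what role it plays, and fusing drops the parts tied to the intermediate Zarr file. All names here (`Role`, `Part`, `op_parts`, `fuse`) are hypothetical. The sketch also assumes that the upstream output array is passed directly as the downstream input, and that all remaining parts are live at once, which may well be too pessimistic.

```python
from enum import Enum
from typing import NamedTuple


class Role(Enum):
    READ = "compressed chunk read from Zarr"
    INPUT = "decompressed input array"
    OUTPUT = "output array"
    WRITE = "compressed chunk written to Zarr"


class Part(NamedTuple):
    op: str
    role: Role
    nbytes: int


def op_parts(op, chunk_nbytes):
    """Four parts for a one-input blockwise op, each counted as one chunk."""
    return [Part(op, role, chunk_nbytes) for role in Role]


def fuse(upstream, downstream):
    """Parts of the fused task, without the intermediate Zarr round trip.

    Dropped: the upstream write, and the downstream read and decompressed
    input (assumed to be the upstream output array itself).
    """
    kept_up = [p for p in upstream if p.role is not Role.WRITE]
    kept_down = [p for p in downstream if p.role not in (Role.READ, Role.INPUT)]
    return kept_up + kept_down


chunk = 8_000_000  # assumed chunk size in bytes
a = op_parts("a", chunk)
b = op_parts("b", chunk)

naive_total = sum(p.nbytes for p in a + b)        # all eight parts
fused_total = sum(p.nbytes for p in fuse(a, b))   # five parts remain
```

With these assumptions the fused task is projected at five chunks rather than the eight obtained by simply adding the two operations' estimates.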

tomwhite, Aug 01 '23 14:08