zarr-python icon indicating copy to clipboard operation
zarr-python copied to clipboard

Add Codec unit tests

Open rabernat opened this issue 1 year ago • 3 comments

While working on #2031 I became familiar with the new V3 Codec API and its peculiarities. And I saw that we don't yet have actual unit tests for the codecs. We have some tests in tests/v3/test_codecs/, but I'd call these more end-to-end tests, since they are creating Arrays.

I think it's important for us to unit-test al of the important internal interfaces separately from end-to-end tests. This is particularly important for codecs, so we can guard against data corruption issues.

This PR is a step in that direction.

TODO:

  • [ ] Add decode_partial and encode_partial tests.
  • [ ] Parametrize more variation of input data
  • [ ] Look for opportunities to make these tests simpler / faster (right now there is a combinatorial explosion of possibilities)

rabernat avatar Jul 13 '24 21:07 rabernat

One area of feedback on the Codec API: it makes very little sense to me that the Codec API is async. Almost by definition Codecs are blocking, CPU-intensive code. They are not doing I/O. Why should their core methods be async?

It should be the Pipeline's job to dispatch blocking Codecs calls to threads. Not the Codec itself.

rabernat avatar Jul 14 '24 17:07 rabernat

One area of feedback on the Codec API: it makes very little sense to me that the Codec API is async. Almost by definition Codecs are blocking, CPU-intensive code. They are not doing I/O. Why should their core methods be async?

It should be the Pipeline's job to dispatch blocking Codecs calls to threads. Not the Codec itself.

The reason why all codecs need to be async is because sharding is a codec, and the encode / decode operation of the sharding codec requires doing IO.

I would love to see some formal separation between "codecs that read from storage" (i.e., just sharding) and "codecs that transform bytes in memory" (all the other codecs), but I'm not sure what this would look like.

d-v-b avatar Jul 20 '24 20:07 d-v-b

oops, I approved without noting that this is a draft. sorry for the noise. consider it a draft approval.

d-v-b avatar Oct 10 '24 19:10 d-v-b