Supporting GDeflate in numcodecs

Open akshaysubr opened this issue 1 year ago • 2 comments

Problem description

GDeflate is a new compression format designed to match the compression ratio of zlib/gzip/deflate while decompressing very quickly on the GPU (see some benchmarks here). It is currently a standard in Microsoft DirectStorage for gaming applications, but it would be very useful for zarr as well, since its ideal use case is compress once, decompress many times. The released GDeflate spec is here: https://github.com/microsoft/DirectStorage/blob/main/GDeflate/GDeflate/README.md

The GPU codecs for GDeflate currently ship through kvikIO, but they cover only GPU compression and decompression. In practice, data compressed on the GPU then needs a GPU to decompress it as well. We also have CPU routines for GDeflate, and it would be good to wrap those and expose them through numcodecs so there is a CPU fallback.
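To make the CPU-fallback idea concrete, here is a minimal sketch of what such a wrapper could look like. It follows the general shape of the numcodecs codec interface (`codec_id`, `encode`, `decode`, `get_config`), but it is written against the standard library only so it runs on its own. The class name `GDeflateCPUSketch`, the `codec_id` value, and the `level` parameter are hypothetical, and `zlib` is only a stand-in for the CPU GDeflate routines: it does not produce GDeflate-compatible streams. A real implementation would presumably subclass `numcodecs.abc.Codec` and be registered with `numcodecs.registry.register_codec`.

```python
import zlib


class GDeflateCPUSketch:
    """Hypothetical CPU codec shaped like a numcodecs codec.

    ASSUMPTION: zlib stands in for the CPU GDeflate routines. The bytes
    produced here are plain zlib streams, not GDeflate streams.
    """

    codec_id = "gdeflate_cpu_sketch"  # hypothetical identifier

    def __init__(self, level=6):
        self.level = level

    def encode(self, buf):
        # Accept any object exposing the buffer protocol (bytes, bytearray, ...).
        return zlib.compress(memoryview(buf).tobytes(), self.level)

    def decode(self, buf, out=None):
        data = zlib.decompress(memoryview(buf).tobytes())
        if out is None:
            return data
        # Copy into a caller-provided writable buffer and return it.
        view = memoryview(out).cast("B")
        view[: len(data)] = data
        return out

    def get_config(self):
        # Configuration needed to rebuild an equivalent codec instance.
        return {"id": self.codec_id, "level": self.level}


codec = GDeflateCPUSketch(level=6)
payload = b"zarr chunk " * 100
encoded = codec.encode(payload)
decoded = codec.decode(encoded)
```

The point of the design is that the on-disk format, not the device, defines the codec: if the CPU and GPU paths read and write the same stream format, data written by either can be read by the other.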

The main question is where this CPU GDeflate codec should go. It could also be packaged in kvikIO, but since kvikIO is a GPU-focused library, it might not be the most natural home for a CPU codec. Would it be acceptable to add the codec to numcodecs directly? I would appreciate some guidance on this.

akshaysubr, Dec 07 '23 07:12

I would love to include GDeflate directly in numcodecs. 🙌

rabernat, Dec 07 '23 12:12

@rabernat Thanks, I'll work on a PR for it then. I might have some questions about packaging, which I'll post here if that's okay.

akshaysubr, Dec 11 '23 17:12