cosima-cookbook

`cookbook` with `dask` client gives out _a lot of_ warnings about bad chunking

Open • navidcy opened this issue 1 year ago • 9 comments

I've been getting these for a while. I think we shouldn't ignore these.

I managed to pin it down to the warnings appearing only when a dask client is present. See

https://gist.github.com/navidcy/2fd2a4464d8cf06b25e1b2130c3cfbe9

navidcy, Feb 27 '24 11:02

Discussion in https://github.com/COSIMA/cosima-recipes/pull/305 might be relevant.

cc @angus-g, @dougiesquire

navidcy, Feb 27 '24 11:02

I think this is the same issue as #333

And per https://github.com/COSIMA/cosima-recipes/pull/305#issuecomment-1776394141, the suggestion is to remove the chunking that the cookbook is doing.

anton-seaice, Feb 27 '24 22:02

> And per https://github.com/COSIMA/cosima-recipes/pull/305#issuecomment-1776394141, the suggestion is to remove the chunking that the cookbook is doing.

Yup. Xarray v2023.09.0 introduced some changes to the way the `chunks` argument to `open_dataset` is handled. As part of this, a warning was added for when the requested dask chunks divide the netCDF chunks. The cosima-cookbook `getvar` function internally determines the netCDF chunking of the first variable in the file and opens the file using that chunking. If other variables in the file have larger chunks than the requested variable, their chunks will be divided and the warning will be raised.

As I said in https://github.com/COSIMA/cosima-recipes/pull/305#issuecomment-1776394141, the best fix is probably to change the cosima-cookbook to open the files with `chunks={}` rather than the chunking of the first variable. This will then use each variable's netCDF chunking.

Note that all that's changed is that the warning has been added to xarray. This was always happening in the cosima-cookbook.
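To make the mechanism concrete, here is a rough pure-Python model of it. This is a simplified sketch, not xarray's actual check: the helper names, the dimension length, and the chunk sizes are all made up for illustration. It treats a variable's stored chunks as "divided" whenever a requested dask chunk boundary falls strictly inside a stored chunk.

```python
def chunk_breaks(length, chunk_size):
    """Interior chunk boundaries along one dimension of the given length."""
    return set(range(chunk_size, length, chunk_size))


def splits_stored_chunks(length, requested, stored):
    """True if a requested (dask) chunk boundary falls inside a stored chunk."""
    return not chunk_breaks(length, requested) <= chunk_breaks(length, stored)


# Hypothetical file: two variables sharing a dimension of length 1000,
# stored with different netCDF chunk sizes along that dimension.
length = 1000
stored = {"first_var": 100, "other_var": 250}

# Model of the current behaviour: the first variable's chunking is
# applied to every variable in the file.
requested = stored["first_var"]
split_by_first = {
    name: splits_stored_chunks(length, requested, size)
    for name, size in stored.items()
}

# Model of chunks={}: each variable is opened with its own stored chunking.
split_by_native = {
    name: splits_stored_chunks(length, size, size)
    for name, size in stored.items()
}

print(split_by_first)   # {'first_var': False, 'other_var': True}
print(split_by_native)  # {'first_var': False, 'other_var': False}
```

In this toy model only the variable with the larger stored chunks is flagged, and nothing is flagged once every variable uses its own stored chunking. Requested chunks that are whole multiples of the stored chunks (e.g. 500 against 250) are not flagged either.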

dougiesquire, Feb 27 '24 23:02

oops, yes this may be a duplicate or related to #333!

@dougiesquire yeap thanks for re-iterating. I just wanted this in the cookbook repo.

So I understand that the only change is that the warnings now appear. But I presume that the folks at xarray/dask decided to add the warnings for a reason. Therefore, from what I understand, we could either:

  • take these warnings into consideration because we agree they are important and do something about it or
  • point out to xarray devs that we have a case in which these warnings are not relevant and showcase an example via an issue in their repo or
  • decide that the warnings are relevant but still we are doing the best we can anyway... in that case do something so that the warnings don't show up to every cookbook user or
  • something else?

navidcy, Feb 28 '24 06:02

> point out to xarray devs that we have a case in which these warnings are not relevant and showcase an example via an issue in their repo

I think the warnings are doing what they're supposed to here.

If you want to, you could ignore the warnings with something like:

import warnings
warnings.filterwarnings("ignore", message="The specified chunks separate the stored chunks along dimension")
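If a global filter feels too broad, one possible variant is to scope the filter to a single call with the standard-library `warnings.catch_warnings()` context manager. This is only a sketch: `load_quietly` and `fake_loader` are hypothetical names, and the stand-in loader just emits a warning that starts with the same text so that the example runs without xarray.

```python
import warnings

# Start of the warning text quoted above; filterwarnings matches it
# (as a regex) against the beginning of each warning message.
CHUNK_WARNING = "The specified chunks separate the stored chunks along dimension"


def load_quietly(loader, *args, **kwargs):
    """Call loader(...) with only the chunk-splitting warning suppressed."""
    with warnings.catch_warnings():
        warnings.filterwarnings("ignore", message=CHUNK_WARNING)
        return loader(*args, **kwargs)


def fake_loader():
    """Stand-in for a real loading call; emits a look-alike chunk warning."""
    warnings.warn(CHUNK_WARNING + ' "time"', UserWarning)
    return "data"


result = load_quietly(fake_loader)  # no chunk warning is shown here
```

The filter is removed again when the `with` block exits, so other warnings, and the same warning raised elsewhere, still show up as usual.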

Probably the best option is to change the cosima-cookbook as described above, but afaik there is no one maintaining the cookbook.

dougiesquire, Feb 28 '24 22:02

> • something else?

Use the intake-catalogue :)

anton-seaice, Feb 28 '24 22:02

@angus-g is this resolved with #341?

navidcy, Aug 06 '24 00:08

> > • something else?
>
> Use the intake-catalogue :)

cheeky :)

navidcy, Aug 06 '24 00:08

> @angus-g is this resolved with #341?

No

angus-g, Aug 06 '24 00:08