Unable to append to a very large zarr store
Zarr version
v2.16.1
Numcodecs version
v0.12.0
Python Version
3.11.6
Operating System
Linux
Installation
using conda
Description
I have a very large (~1 PB) zarr store on Google Cloud Storage here, containing output from a reanalysis (weather model fields like air temperature, wind velocity, etc.). I am using xarray and generally following this guidance to append to the zarr store along the time dimension. The existing dataset has ~30 years of data at 3 hour frequency and 1/4 degree resolution, and I just want to append about two more months of data, so I am appending a relatively small amount of data to a huge existing zarr store.
I tried this locally with a small example and it worked fine (included below), but I can't seem to do it with this large dataset. I'm executing the initial append step (highlighted as problematic below) from a c2-standard-60 node (240 GB RAM) on Google Cloud, but it never completes, and the node eventually becomes unresponsive. Any tips on how to do something like this would be very helpful, and please let me know if I should post this somewhere else. Thanks in advance!
Steps to reproduce
import numpy as np
import xarray as xr

# this is the existing dataset
ds = xr.open_zarr(
    "gcs://noaa-ufs-gefsv13replay/ufs-hr1/0.25-degree/03h-freq/zarr/fv3.zarr",
    storage_options={"token": "anon"},
)

# grab just 2 time stamps of the data, store locally
# this is an example to mimic the existing dataset on GCS
ds[["tmp"]].isel(time=slice(2)).to_zarr("test.zarr")

# now get the next two time stamps and append
# this is the step that never completes for the real thing
xds = ds[["tmp"]].isel(time=slice(2, 4)).load()
(np.nan * xds).to_zarr("test.zarr", append_dim="time", compute=False)  # <- this is the operation that never completes

# this is what I'll eventually do to actually fill the appended container with values
for i in range(2, 4):
    region = {
        "time": slice(i, i + 1),
        "pfull": slice(None, None),
        "grid_yt": slice(None, None),
        "grid_xt": slice(None, None),
    }
    xds.isel(time=[i - 2]).to_zarr("test.zarr", region=region)
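A possible variation is sketched below, though it has not been verified at the full ~1 PB scale: skip the .load() so the template stays dask-backed, so that compute=False only has to write metadata and coordinates.

# Hedged variation on the append step above, not part of the original reproducer:
# keep the template lazy so no data values are loaded before the metadata-only append.
import numpy as np
import xarray as xr

ds = xr.open_zarr(
    "gcs://noaa-ufs-gefsv13replay/ufs-hr1/0.25-degree/03h-freq/zarr/fv3.zarr",
    storage_options={"token": "anon"},
)

xds_lazy = ds[["tmp"]].isel(time=slice(2, 4))   # still dask-backed, nothing loaded
template = xr.full_like(xds_lazy, np.nan)       # lazy NaN-filled template
template.to_zarr("test.zarr", append_dim="time", compute=False)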
Additional output
No response
Hi @timothyas - sorry this never got any traction. Did you manage to find a workaround? I noticed this dataset is now available on GCS!
Hi @jhamman, thanks for reaching out! I haven't had time to dedicate to this, since it hasn't been on our critical path. I had success modifying xarray-beam to append to a subsampled (i.e., smaller) version of the dataset, and I'm hoping we can find a way to scale that workflow up. I can follow up on this issue with what we find.
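For context, the general shape of an xarray-beam zarr-writing pipeline looks roughly like the sketch below; this is not the actual modification mentioned above, and the paths and chunk sizes are placeholders.

# Rough sketch of a generic xarray-beam pipeline that writes a dataset to zarr in
# parallel; paths and chunk sizes are placeholders, not the modified append workflow.
import apache_beam as beam
import xarray as xr
import xarray_beam as xbeam

ds = xr.open_zarr("source.zarr")        # placeholder source dataset
template = xbeam.make_template(ds)      # lazy template describing the target store

with beam.Pipeline() as p:
    (
        p
        | xbeam.DatasetToChunks(ds, chunks={"time": 1})   # one chunk per time step
        | xbeam.ChunksToZarr(
            "target.zarr", template=template, zarr_chunks={"time": 1}
        )
    )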
That said, it could be the case that icechunk updates from Earthmover help resolve this? I'm looking forward to the webinar on Tuesday to learn more.
PS: Yep, the more public-friendly webpage for the dataset is here, and a gist describing the data available on GCS is here, which is also linked from that other site.
@timothyas - sounds good. FWIW, I just ran your reproducer above without issue on my laptop, so I'd say try it again; perhaps this just works now.
> That said, it could be the case that icechunk updates from Earthmover help resolve this? I'm looking forward to the webinar on Tuesday to learn more.
Glad you are going to make the webinar. Icechunk is going to be a great help for workloads like this that need to incrementally update a large dataset but want to do it safely (i.e. with transactions).
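The rough transactional pattern looks like the sketch below (hedged: the icechunk Python API is still evolving, so these names follow the current docs and may differ by release; the toy dataset and repo path are made up here).

# Hedged sketch of icechunk's transactional update workflow; API names may differ
# between icechunk releases, and the toy dataset / repo path are made up.
import numpy as np
import pandas as pd
import xarray as xr
import icechunk

def toy(times):
    # tiny stand-in dataset with a time dimension
    return xr.Dataset(
        {"tmp": (("time", "y", "x"), np.random.rand(len(times), 4, 4))},
        coords={"time": times},
    )

storage = icechunk.local_filesystem_storage("example-icechunk-repo")
repo = icechunk.Repository.create(storage)

# transaction 1: initial write
session = repo.writable_session("main")
toy(pd.date_range("2020-01-01", periods=2, freq="3h")).to_zarr(
    session.store, mode="w", zarr_format=3, consolidated=False
)
session.commit("initial two time steps")

# transaction 2: append along time; nothing is visible on "main" until the commit
session = repo.writable_session("main")
toy(pd.date_range("2020-01-01 06:00", periods=2, freq="3h")).to_zarr(
    session.store, append_dim="time", consolidated=False
)
session.commit("append two more time steps")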
Sorry, the reproducer is unfortunately not ideal. It succeeds for me too for small-to-medium-sized datasets, but fails at scale. I'm not sure exactly where the threshold is, but it definitely fails for the ~1 PB dataset I was trying it with. Either way, we'll try it with a larger cluster, and it also sounds like icechunk should help with this task. I'm excited to learn more.