sgkit
Xarray serialization warning when saving dataset
From #785:
```python
import sgkit as sg
import sgkit.io.vcf as sgvcf
sgvcf.vcf_to_zarr("sgkit/tests/io/vcf/data/sample.vcf.gz", "sample.vcf.gz.zarr")
ds = sg.load_dataset("sample.vcf.gz.zarr")
sg.save_dataset(ds, "sample2.vcf.gz.zarr", mode="w")
```
prints the warning:
```
SerializationWarning: variable None has data in the form of a dask array with dtype=object, which means it is being loaded into memory to determine a data type that can be safely stored on disk. To avoid this, coerce this variable to a fixed-size dtype with astype() before saving it.
```
There is an upstream xarray discussion about this: https://github.com/pydata/xarray/discussions/5769. #643 is also related.
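
For reference, a minimal sketch of the workaround the warning text itself suggests: cast every object-dtype data variable to a fixed-width string dtype with `astype()` before saving. The helper name `coerce_object_vars`, its `width` parameter, and the toy dataset are hypothetical, not part of sgkit or xarray, and I have not checked that this silences the warning for the sgkit dataset above. Since the warning reports `variable None`, the sketch does not try to pick out a single variable and instead covers all object-dtype data variables.

```python
import numpy as np
import xarray as xr


def coerce_object_vars(ds: xr.Dataset, width: int = 32) -> xr.Dataset:
    """Return a copy of ds with object-dtype data variables cast to fixed-width strings.

    `width` is an assumed upper bound on string length (hypothetical parameter).
    NumPy truncates longer values, so choose it from knowledge of the data.
    """
    out = ds.copy()
    for name in list(out.data_vars):
        if out[name].dtype == object:
            out[name] = out[name].astype(f"U{width}")
    return out


# Toy dataset standing in for the loaded VCF dataset; variable names are illustrative.
ds = xr.Dataset(
    {
        "variant_id": ("variants", np.array(["rs1", "rs22", "."], dtype=object)),
        "variant_position": ("variants", np.array([10, 20, 30])),
    }
)

fixed = coerce_object_vars(ds, width=8)
print(fixed["variant_id"].dtype.kind)  # U
```

With the real dataset, the result of `coerce_object_vars(ds)` would be what gets passed to `sg.save_dataset`. The fixed `width` is a deliberate trade-off: it avoids scanning the data for the longest string, which would load it into memory just as the warning describes, but it silently truncates anything longer.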