sgkit
Docs: how do I *update* a dataset on file?
I can't find any documentation on how to do something like:
import sgkit as sg

ds = sg.load_dataset(ds_path)
ds = sg.count_variant_alleles(ds)
# Don't overwrite the whole dataset; just write out the new variables so I don't have to
# recompute them
sg.save_dataset(ds, ds_path)
We have a "how do I save" entry in the docs, but it doesn't help here.
#347 has some discussion and code for this. We didn't quite agree on an API, so I'd be interested to see what you think works in your situation.
Nice, I'll try that out and report back.
I've been using this pattern:
ds.update({"new_array_name": new_array})
sgkit.save_dataset(ds.drop_vars(set(ds.data_vars) - {"new_array_name"}), ds_dir, mode="a")