Method to clear metadata
When working with metadata in tables where the schema allows null entries, it is helpful to be able to clear the metadata column. I'm currently doing:
tables.nodes.packset_metadata([b''] * tables.nodes.num_rows)
But this seems a bit hacky and unintuitive. I wonder if it would be worth having a wrapper function to do this?
I'm not sure if this is a common enough editing-rows operation to special-case it?
Yes, I'm not sure. But metadata is special in a way, as it can be cleared without affecting the integrity of the tree sequence.
Any ragged column could be cleared without affecting the referential integrity. The question is whether we can be bothered adding methods for all of them (or whether they would be of any use).
We could easily add a method clear_metadata here that would be inherited by all tables that have metadata, so I think that's an easy addition and not too much complexity.
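For illustration, here is a minimal sketch of what such a method might do, assuming the usual ragged-column layout of one flat data buffer plus an offset array with num_rows + 1 entries. The class RaggedMetadataColumn and its attributes are hypothetical stand-ins written in plain Python, not tskit's actual implementation.

```python
class RaggedMetadataColumn:
    """Hypothetical stand-in for a table's ragged metadata column."""

    def __init__(self, rows):
        self.packset_metadata(rows)

    def packset_metadata(self, rows):
        # Pack a list of bytes objects into one flat buffer plus offsets.
        self.metadata = b"".join(rows)
        self.metadata_offset = [0]
        for row in rows:
            self.metadata_offset.append(self.metadata_offset[-1] + len(row))

    @property
    def num_rows(self):
        # The offset array always has one more entry than there are rows.
        return len(self.metadata_offset) - 1

    def row(self, j):
        # Slice the flat buffer using the offsets of row j and row j + 1.
        start, end = self.metadata_offset[j], self.metadata_offset[j + 1]
        return self.metadata[start:end]

    def clear_metadata(self):
        # Same effect as the one-liner in the original post: every row
        # keeps its place, but its metadata becomes the empty byte string.
        self.packset_metadata([b""] * self.num_rows)


col = RaggedMetadataColumn([b"abc", b"", b"de"])
col.clear_metadata()
```

Clearing leaves the number of rows unchanged and only empties the data buffer and zeroes the offsets, which is why no other column or table is affected.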
We do need to think about whether this operation should examine the schema though, and see if the result is compatible with the schema. I think probably not?
Any ragged column could be cleared without affecting the referential integrity.
True. I think I meant that it doesn't change the "look" of the tree sequence to tskit, which would not be the case when clearing, e.g., the ancestral_state column.
We could easily add a method clear_metadata here that would be inherited by all tables that have metadata, so I think that's an easy addition and not too much complexity.
Neat. I think this is reasonably useful.
We do need to think about whether this operation should examine the schema though, and see if the result is compatible with the schema. I think probably not?
Probably not, although perhaps the only time it would fail (I think) is if it is a struct without "null" in the top-level type union? So that might be an easy check?
No, there's any number of different ways it could fail I'm afraid, and without higher-level metadata APIs we're wasting our time trying to enumerate them.
Fine - happy to avoid this check then.
Happy with this idea - although I think it is trivial to check against the schema and worth doing. I'd add a force argument to override the check.
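To make the last suggestion concrete, here is a rough sketch of how a force argument could gate a schema check. Everything in it is an assumption for illustration: the free function clear_metadata, the validate callable, and the require_nonempty validator are hypothetical and do not correspond to an existing tskit API. It also sidesteps the question of how a real schema would be checked, which is the part described above as hard to enumerate.

```python
def clear_metadata(rows, validate=None, force=False):
    """Return a cleared copy of rows (a list of bytes objects).

    validate is a hypothetical callable that raises ValueError when the
    given encoded metadata is not acceptable to the schema. Passing
    force=True skips that check.
    """
    if validate is not None and not force:
        # Only the empty byte string needs checking, since every row
        # ends up with the same value.
        validate(b"")
    return [b""] * len(rows)


def require_nonempty(encoded):
    # Toy validator standing in for a schema that forbids empty metadata.
    if len(encoded) == 0:
        raise ValueError("schema does not permit empty metadata")


rows = [b"abc", b"de"]
cleared = clear_metadata(rows)  # no schema check requested
forced = clear_metadata(rows, validate=require_nonempty, force=True)
```

Without force=True, the second call would raise ValueError, so the default stays safe while still leaving an escape hatch.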