spatialdata
Method to validate the relationship between elements
I discussed this with @melonora today. Relevant also to @timtreis and @sagar87. CC @giovp
I would add a method to check the consistency of the table and the elements. This function would not throw errors, but would check whether relationships are missing or invalid. This is useful because sometimes we catch bugs only downstream (when trying to plot or aggregate something).
Not sure when we should call this method: maybe after the constructor, after reading, and before saving. Or just let the user call it. The name could be `validate_data_relationships()`.
Things that would be checked:
- [ ] The regions in `table.uns['spatialdata_attrs']['region']` are present in the `sdata` object.
- [ ] The column with name `table.uns['spatialdata_attrs']['region_key']` exists.
- [ ] The values of the rows in the column `table.uns['spatialdata_attrs']['region_key']` are exactly the ones in `table.uns['spatialdata_attrs']['region']`.
- [ ] The column with name `table.uns['spatialdata_attrs']['instance_key']` exists.
- [ ] The values of the rows in the column `table.uns['spatialdata_attrs']['instance_key']` correspond to the values in the index of the corresponding regions. This check is done in `napari_spatialdata` when creating a shapes layer and/or a labels layer. For instance, this warning is given when some of the labels values and the table `instance_key` values don't match:
2023-04-05 15:12:39.751 | WARNING | napari_spatialdata.interactive:_find_annotation_for_labels:435 - 11050/11051 labels not annotated: {1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86}
- [ ] In a points object, the column with name `INSTANCE_KEY` exists.
- [ ] In a points object, the values of the `INSTANCE_KEY` column actually refer to real regions. I have just realized that there may be a bug around this; I discuss it in this issue: https://github.com/scverse/spatialdata/issues/217
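
To make the table checks above concrete, here is a rough sketch of what such a function could look like. It is not the code from the PR. To stay runnable without `spatialdata` installed, it works on plain Python stand-ins, and all three parameter names are assumptions made for illustration: `attrs` plays the role of `table.uns['spatialdata_attrs']`, `obs` stands in for the table columns, and `element_indices` maps each element name in the `sdata` object to its instance ids. The points-related checks are left out. In line with the idea of not throwing errors, it returns a list of problems.

```python
from typing import Any, Hashable, Iterable, Mapping


def validate_data_relationships(
    attrs: Mapping[str, Any],
    obs: Mapping[str, Iterable[Hashable]],
    element_indices: Mapping[str, Iterable[Hashable]],
) -> list[str]:
    """Return human-readable problems; an empty list means every check passed.

    All three arguments are hypothetical stand-ins (see the text above),
    not the real spatialdata objects.
    """
    problems: list[str] = []

    # 'region' may be a single name or a list of names; normalize to a list.
    regions = attrs.get("region")
    if regions is None:
        regions = []
    elif isinstance(regions, str):
        regions = [regions]
    regions = list(regions)
    region_key = attrs.get("region_key")
    instance_key = attrs.get("instance_key")

    # Check: every annotated region is an element of the object.
    for region in regions:
        if region not in element_indices:
            problems.append(f"region {region!r} is not an element of the object")

    # Checks: the region_key and instance_key columns exist.
    missing_column = False
    for name, key in (("region_key", region_key), ("instance_key", instance_key)):
        if key is None or key not in obs:
            problems.append(f"{name} column {key!r} is missing from the table")
            missing_column = True
    if missing_column:
        return problems  # the remaining checks need both columns

    region_values = list(obs[region_key])
    instance_values = list(obs[instance_key])

    # Check: the values in the region_key column are exactly the declared regions.
    if set(region_values) != set(regions):
        found = sorted(set(region_values), key=str)
        expected = sorted(set(regions), key=str)
        problems.append(f"values in {region_key!r} are {found}, expected {expected}")

    # Check: instance ids used in the table exist in the matching element's index.
    for region in regions:
        if region not in element_indices:
            continue  # already reported above
        known = set(element_indices[region])
        used = {i for r, i in zip(region_values, instance_values) if r == region}
        unknown = used - known
        if unknown:
            problems.append(
                f"{len(unknown)} instance(s) in the table are not in the index of {region!r}"
            )

    return problems


# Usage: the table row with cell_id 99 has no matching instance in 'cells'.
attrs = {"region": "cells", "region_key": "region", "instance_key": "cell_id"}
obs = {"region": ["cells", "cells", "cells"], "cell_id": [1, 2, 99]}
elements = {"cells": [1, 2, 3]}
problems = validate_data_relationships(attrs, obs, elements)
for problem in problems:
    print(problem)
```

Returning a list rather than raising leaves it to the caller (constructor, reader, writer, or the user) to decide whether to warn or fail.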
I started working on this in https://github.com/scverse/spatialdata/pull/468. We should:
- [ ] call this function before writing and after reading, and maybe somewhere else.