matnwb
matnwb copied to clipboard
[Bug]: Softlinks does not enforce the correct type, nor is it required to add a linked type as a softlink
What happened?
This issue related to the following points:
- I can add a linked type directly to an NWB type/class without using the
types.untyped.Softlink. Example (adding a device):
% Create a device
device = types.core.Device();
nwb.general_devices.set('Device', device);
imaging_plane = types.core.ImagingPlane( ...
'description', 'a very interesting part of the brain', ...
'device', device, ...
- I can add another type to the
deviceproperty using a SoftLink class.
Example:
% Create an optical channel.
optical_channel_green = types.core.OpticalChannel( ...
'description', 'green', ...
'emission_lambda', 500.);
imaging_plane = types.core.ImagingPlane( ...
'description', 'a very interesting part of the brain', ...
'device', types.untyped.SoftLink(optical_channel_green), ...
In both cases the nwb export works fine.
This issue might be related to this one: https://github.com/NeurodataWithoutBorders/matnwb/issues/48
The concept of h5 links / softlinks might be too specific for the average user to care about so maybe it would be better to handle this internally?
I.e in the set method for a linked type, if the value is not a types.untyped.SoftLink convert it to one? This would take away the burden from the user to check whether a type is linked or "embedded" (i.e an optical channel is "embedded", not linked as far as I can see). It would also ensure that it's harder to add types that are not correct, as in point 2.
Further, if the user provides an object of type types.untyped.SoftLink, the validator should check that the target of the types.untyped.SoftLink is of the correct type.
See here for three .m scripts to reproduce: https://www.dropbox.com/scl/fo/bdvm4tgc79n2w4sn4f2ij/h?rlkey=wccijdyeaxm1b5gonzwmcu0w2&dl=0
Steps to Reproduce
See linked files above
Error Message
No errors
Operating System
macOS
Matlab Version
MATLAB Version: 23.2.0.2409890 (R2023b) Update 3 MATLAB License Number: 885740 Operating System: macOS Version: 13.5 Build: 22G74 Java Version: Java 1.8.0_392-b08 with Amazon.com Inc. OpenJDK 64-Bit Server VM mixed mode
Code of Conduct
- [X] I agree to follow this project's Code of Conduct
- [X] Have you ensured this bug was not already reported?
Also, is there a reason a softlink is only being resolved when the NWB file is being exported, and not when the SoftLink is created in the first place?
Also, is there a reason a softlink is only being resolved when the NWB file is being exported, and not when the SoftLink is created in the first place?
I actually don't know if softlink is ever technically validated. HDF5 certainly doesn't care if you write an invalid one.
Take this script for example:
fid = H5F.create('test.h5');
sl1 = types.untyped.SoftLink('/dne1');
sl1.export(fid, '/link1', {});
sl2 = types.untyped.SoftLink('/dne2');
sl2.export(fid, '/link2', {});
sl3 = types.untyped.SoftLink('/dne3');
sl3.export(fid, '/link3', {});
H5F.close(fid);
This creates a valid hdf5 file, you just get missed target paths in HDFView.
That said, you cannot actually validate if an object is "pointable" by HDF5 path until you actually write the object to the same file as the soft link which is probably why any kind of validation is done at export level. Certainly, though, someone could add some validation on the type of object when passing softlinks in, but this is also only the case if the softlink was provided with a target object instead of a raw path (some old matnwb scripts probably still use this).
I do like the idea of auto-coercing to a SoftLink but I would argue that explicitly defining the SoftLink lets the user know that the data is actually not written to disk if wrapped in one. The user still has to embed the file somewhere or you will produce a NWB file with invalid links.
Note that I will not be doing much more work on MatNWB but I leave the repo open in case PRs come in. I would imagine that the repo itself will be archived at some point in the future.
That said, you cannot actually validate if an object is "pointable" by HDF5 path until you actually write the object to the same file as the soft link which is probably why any kind of validation is done at export level.
I didn't look into this in detail, but I imagine you could easily modify the NWB File object to store an internal cache or map of all the objects that have been added to it.
I.e, when I do nwb.general_devices.set('Device', device) that could be flagged somehow on the nwb file object, and when I later add a softlink, it could validate against the objects that are already cached/flagged.
I can imagine it is not foolproof, but the alternative is to allow types to be added as groups instead of links or even adding the completely wrong types, and this is a violation of the nwb schemas.
I would argue that explicitly defining the SoftLink lets the user know that the data is actually not written to disk if wrapped in one. The user still has to embed the file somewhere or you will produce a NWB file with invalid links.
I don't think anyone who is not very familiar with h5 would catch this nuance.
Note that I will not be doing much more work on MatNWB but I leave the repo open in case PRs come in. I would imagine that the repo itself will be archived at some point in the future.
Oh, that's unfortunate, I hope there will be a way to keep it alive still!
I didn't look into this in detail, but I imagine you could easily modify the NWB File object to store an internal cache or map of all the objects that have been added to it.
With the way that MatNWB has designed assignments, keeping a map at the highest level will only work if the assignment includes the NwbFile object i.e. nwb.acquisition.set('acq', obj);. But this does not work for anything more advanced object compositions like, say Dynamic Tables. You can easily construct a DynamicTable without touching the nwb object at all (which would not be caught by subsref). I suppose you could have all classes do this and have the nwb object just merge the maps at the top when the top-level subsref receives the object. Then you could check broken soft links prior to export.
...but the alternative is to allow types to be added as groups instead of links or even adding the completely wrong types, and this is a violation of the nwb schemas.
I agree, this is an oversight in design of SoftLinks. Some low-hanging fruit for development here is to just generalize the ExternalLink validation (which ironically has better and more thorough checks) so at least the wrong type is not passed in.
Pt 2 from the first post is also true for types.untyped.ObjectView
Also here there is an opportunity for hiding the implementation of ObjectView as suggested for Softlinks
I.e in the set method for a linked type, if the value is not a types.untyped.SoftLink convert it to one? This would take away the burden from the user to check whether a type is linked or "embedded" (i.e an optical channel is "embedded", not linked as far as I can see). It would also ensure that it's harder to add types that are not correct, as in point 2.