silx
silx copied to clipboard
silx view: Added slicing support
Merge PR #3665 first!
This PR adds a first support of slicing to silx view command line argument.
It does so by adding a h5py-like class wrapping the sliced dataset, and making silx.io.open returning it if there is some slicing.
It also adds a DataUrl.data_slice_string() method and fixes the generation of a representation string from the data_slice.
It still needs some testing.
related to #3615 and #3577
I added support for shorter syntax to provide slicing.
The initial syntax is: file.h5?path=/path/to/data&slice=1, but it is cumbersome to write on the command line: it's long and the & requires to surround the URL with ".
I propose here to use the URL fragment # for slicing: file.h5?/path/to/data#slicing (and file.h5::/path/to/data#slicing).
There is a discussion of different options here: https://github.com/silx-kit/silx/issues/3577#issuecomment-1044241996.
Some alternatives:
file.h5?/path/to/data[slicing](andfile.h5::/path/to/data[slicing])file.h5?/path/to/data#slice=slicing(and file.h5::/path/to/data#slice=slicing`)
What do you think?
I improved the lazy loading so that data is loaded only when needed (before it was loading it for everything, like getting dtype or shape) and is no longer cached.
I'm not sure about: eaefdf0. It simplifies the parameters for DatasetSlice.__init__, but in the event of an external HDF5 link, the initial file and data path are not preserved, instead the dataset uses the "physical" file and path (i.e, it resolves the external link).
The ? in the command line was rejected long ago because it is interpreted under windows. At the time I saw something like hdf5:// to specify URIs but that it is not relevant in this discussion.
The hash # for slicing is fine with me.
There was a pbm with silx view and the refresh button when a subdataset was displayed.
Have you checked such action without your new feature? In case?
Nice.
some general comments
I think it would be very beneficial that the 'slice' information is displayed somewhere (at the tree item or at the dataset path) otherwise we can get quickly confused like when launching something like:
silx view bambou_hercules_0001.h5?/?.1/measurement/pcolinux#0 bambou_hercules_0001.h5?/?.1/measurement/pcolinux#-1
comments on the 'imaging' point of view
If we go for a 'imageJ' like display like comparing first frame and last frame from an acquisition over several acquisition I think we might consider:
- being able to provide several 'slice' into the same '#' request (using ; or & for example). Because now if we go for
'silx view bambou_hercules_0001.h5?/?.1/measurement/pcolinux#0,-1'
we open the last row of the first frame (which make sense for sure).
-
active (display) automatically the first item according to nexus parameters (or maybe a mode if provided like
curve,image...) -
(highly oriented imaging-tomography) in the case several frames are provided I guess the default behavior would be to have one tree item per dataset (so with two or more frames). For imaging I guess users would like to be able to switch quickly from one frame to other (so having one tree item per frame I guess)...
The ? in the command line was rejected long ago because it is interpreted under windows.
Good to know ! Same for & in bash
There was a pbm with silx view and the refresh button when a subdataset was displayed. Have you checked such action without your new feature?
Yes, it's in PR #3665
I think it would be very beneficial that the 'slice' information is displayed somewhere
I completely agree, I wanted to do so, but it's not straight forward to implement, but I'll try again.
being able to provide several 'slice' into the same '#' request (using ; or & for example).
It should be possible to mimic numpy slicing, that would be 'silx view bambou_hercules_0001.h5?/?.1/measurement/pcolinux#[0,-1]' It's in the "spirit" of genericity, but that's not what users want to type I guess.
active (display) automatically the first item according to nexus parameters (or maybe a mode if provided like curve, image...)
Agreed.
(highly oriented imaging-tomography) in the case several frames are provided I guess the default behavior would be to have one tree item per dataset (so with two or more frames). For imaging I guess users would like to be able to switch quickly from one frame to other (so having one tree item per frame I guess)...
Then we may want to make a dedicated CLI for imaging... or find a way to express the slicing as a sort for template in silx view command line...
I added a --slice to silx view CLI option to pass indices of slices to open.
--slice SLICES [SLICES ...]
List of slice indices to open (Only for dataset)
e.g., silx view myfile.h5:/mydata --slice 0 -1.
Each slice declared this way is opened as a separate entry in silx view.
For the generic case, it is combined with the slice provided in the filename to open (e.g., file.h5::/data/path#slice).
There is a limitation of argparse nargs='+' in that --slice SLICES should be at the end of the command line or it tries to parse whatever comes after as int....
@payno would that better fit the need? There is also a separated issue, in that then you cannot quickly browse and visualize one image after another since selecting it it the tree does not display it...
I just tested it, looks very nice. :clap:
Not having the display 'directly' is not that much a problem. Otherwise if this is a huge issue they will let us know.
What is (was?) missing is the handling of the nexus parameters. Especially the "interpretation" one. I tried to fixed it on PR I did on your branch: https://github.com/t20100/silx/pull/32 Looks like this is working but I don't know much about this part of the source code so please double check.
Otherwise LGTM.
Thanks!
I didn't considered providing a Nexus group, only accessing dataset. What would be needed from NeXus? Because since it slices the dataset, the dimensionality is changed and e.g., axes are no longer correct.
From my point of view the most important is the "interpretation". So we can try to display images for example in the case of tomography. Otherwise users will always have to select the image display.
OK, I see then it's a dataset attribute.
I'll check the behaviour when we there is not enough remaining dimensions and maybe filter out all the other attributes for now to be sure.
I think this is taking the wrong path, it's kind of twisting the arm of silx view to fit specific needs into it.
It would be simpler for both usage and maintenance to have a specific application for inspecting stack of images (either a separate application or a silx sub-command, e.g., silx images, silx stack or whatever). This way we can tune the behavior of the application without having to fit into a generic tool, yet re-using silx widgets/functions.
I'd be in favor of NOT integrating this into v1.1.0.
| It would be simpler for both usage and maintenance to have a specific application for inspecting stack of images
I do not consider it a priority for 1.1.0 nor for silx itself because the problem of handling a stack of images is old and there are many applications available in Python, Java, MatLab, IDL, ...
As far as I recall, to access a particular dataset (not slice), we had agreed to use filename.h5::/path/to/group_or_dataset
I hope the entries in this discussion that contain a single : are just typos and that things like 'silx view bambou_hercules_0001.h5?/?.1/measurement/pcolinux#[0,-1]' will contain the missing :: if they ever get implemented (I hope not because those ? can be problematic).
If at a certain point you need to perform multiple selections, remember that at some places we have already implemented the --filepattern approach...
Yes, : is a typo, the separator currently use is :: and there is no reason to change it.
Also ? is already supported because urls like silx://path/file.h5?path=/data/path are supported, but I agree it should not be the only option because ? and & can be problematic on the command line.
I recall @vallsv had foreseen two options, one internal and one for the command line.
Yes i remember there was 2 syntaxes, one which can be parsed with URL stuffs, and another one to simplify the command line.
I agree, i me pretty sure the single : after the filename is a typo.
I recall having some concerns (corner cases, complexity) with this PR... It really needs a second look.
I'm closing this PR in favor of PR #3677 which provides the --slice option that is extracted from here.
Thanks for the discussions