silx icon indicating copy to clipboard operation
silx copied to clipboard

silx view: Added slicing support

Open t20100 opened this issue 3 years ago • 18 comments
trafficstars

Merge PR #3665 first!

This PR adds a first support of slicing to silx view command line argument. It does so by adding a h5py-like class wrapping the sliced dataset, and making silx.io.open returning it if there is some slicing. It also adds a DataUrl.data_slice_string() method and fixes the generation of a representation string from the data_slice.

It still needs some testing.

related to #3615 and #3577

t20100 avatar Sep 19 '22 15:09 t20100

I added support for shorter syntax to provide slicing.

The initial syntax is: file.h5?path=/path/to/data&slice=1, but it is cumbersome to write on the command line: it's long and the & requires to surround the URL with ".

I propose here to use the URL fragment # for slicing: file.h5?/path/to/data#slicing (and file.h5::/path/to/data#slicing).

There is a discussion of different options here: https://github.com/silx-kit/silx/issues/3577#issuecomment-1044241996.

Some alternatives:

  • file.h5?/path/to/data[slicing] (and file.h5::/path/to/data[slicing])
  • file.h5?/path/to/data#slice=slicing (and file.h5::/path/to/data#slice=slicing`)

What do you think?

t20100 avatar Sep 20 '22 11:09 t20100

I improved the lazy loading so that data is loaded only when needed (before it was loading it for everything, like getting dtype or shape) and is no longer cached.

I'm not sure about: eaefdf0. It simplifies the parameters for DatasetSlice.__init__, but in the event of an external HDF5 link, the initial file and data path are not preserved, instead the dataset uses the "physical" file and path (i.e, it resolves the external link).

t20100 avatar Sep 21 '22 15:09 t20100

The ? in the command line was rejected long ago because it is interpreted under windows. At the time I saw something like hdf5:// to specify URIs but that it is not relevant in this discussion.

The hash # for slicing is fine with me.

vasole avatar Sep 26 '22 08:09 vasole

There was a pbm with silx view and the refresh button when a subdataset was displayed.

Have you checked such action without your new feature? In case?

vallsv avatar Sep 26 '22 08:09 vallsv

Nice.

some general comments

I think it would be very beneficial that the 'slice' information is displayed somewhere (at the tree item or at the dataset path) otherwise we can get quickly confused like when launching something like:

silx view bambou_hercules_0001.h5?/?.1/measurement/pcolinux#0 bambou_hercules_0001.h5?/?.1/measurement/pcolinux#-1

comments on the 'imaging' point of view

If we go for a 'imageJ' like display like comparing first frame and last frame from an acquisition over several acquisition I think we might consider:

  • being able to provide several 'slice' into the same '#' request (using ; or & for example). Because now if we go for
'silx view bambou_hercules_0001.h5?/?.1/measurement/pcolinux#0,-1'

we open the last row of the first frame (which make sense for sure).

  • active (display) automatically the first item according to nexus parameters (or maybe a mode if provided like curve, image...)

  • (highly oriented imaging-tomography) in the case several frames are provided I guess the default behavior would be to have one tree item per dataset (so with two or more frames). For imaging I guess users would like to be able to switch quickly from one frame to other (so having one tree item per frame I guess)...

payno avatar Sep 26 '22 13:09 payno

The ? in the command line was rejected long ago because it is interpreted under windows.

Good to know ! Same for & in bash

There was a pbm with silx view and the refresh button when a subdataset was displayed. Have you checked such action without your new feature?

Yes, it's in PR #3665

t20100 avatar Sep 26 '22 14:09 t20100

I think it would be very beneficial that the 'slice' information is displayed somewhere

I completely agree, I wanted to do so, but it's not straight forward to implement, but I'll try again.

being able to provide several 'slice' into the same '#' request (using ; or & for example).

It should be possible to mimic numpy slicing, that would be 'silx view bambou_hercules_0001.h5?/?.1/measurement/pcolinux#[0,-1]' It's in the "spirit" of genericity, but that's not what users want to type I guess.

active (display) automatically the first item according to nexus parameters (or maybe a mode if provided like curve, image...)

Agreed.

(highly oriented imaging-tomography) in the case several frames are provided I guess the default behavior would be to have one tree item per dataset (so with two or more frames). For imaging I guess users would like to be able to switch quickly from one frame to other (so having one tree item per frame I guess)...

Then we may want to make a dedicated CLI for imaging... or find a way to express the slicing as a sort for template in silx view command line...

t20100 avatar Sep 26 '22 15:09 t20100

I added a --slice to silx view CLI option to pass indices of slices to open.

  --slice SLICES [SLICES ...]
                        List of slice indices to open (Only for dataset)

e.g., silx view myfile.h5:/mydata --slice 0 -1.

Each slice declared this way is opened as a separate entry in silx view. For the generic case, it is combined with the slice provided in the filename to open (e.g., file.h5::/data/path#slice).

There is a limitation of argparse nargs='+' in that --slice SLICES should be at the end of the command line or it tries to parse whatever comes after as int....

@payno would that better fit the need? There is also a separated issue, in that then you cannot quickly browse and visualize one image after another since selecting it it the tree does not display it...

t20100 avatar Sep 28 '22 12:09 t20100

I just tested it, looks very nice. :clap:

Not having the display 'directly' is not that much a problem. Otherwise if this is a huge issue they will let us know.

What is (was?) missing is the handling of the nexus parameters. Especially the "interpretation" one. I tried to fixed it on PR I did on your branch: https://github.com/t20100/silx/pull/32 Looks like this is working but I don't know much about this part of the source code so please double check.

Otherwise LGTM.

payno avatar Sep 29 '22 12:09 payno

Thanks!

I didn't considered providing a Nexus group, only accessing dataset. What would be needed from NeXus? Because since it slices the dataset, the dimensionality is changed and e.g., axes are no longer correct.

t20100 avatar Sep 29 '22 13:09 t20100

From my point of view the most important is the "interpretation". So we can try to display images for example in the case of tomography. Otherwise users will always have to select the image display.

payno avatar Sep 29 '22 13:09 payno

OK, I see then it's a dataset attribute.

I'll check the behaviour when we there is not enough remaining dimensions and maybe filter out all the other attributes for now to be sure.

t20100 avatar Sep 29 '22 13:09 t20100

I think this is taking the wrong path, it's kind of twisting the arm of silx view to fit specific needs into it.

It would be simpler for both usage and maintenance to have a specific application for inspecting stack of images (either a separate application or a silx sub-command, e.g., silx images, silx stack or whatever). This way we can tune the behavior of the application without having to fit into a generic tool, yet re-using silx widgets/functions.

I'd be in favor of NOT integrating this into v1.1.0.

t20100 avatar Oct 11 '22 10:10 t20100

| It would be simpler for both usage and maintenance to have a specific application for inspecting stack of images

I do not consider it a priority for 1.1.0 nor for silx itself because the problem of handling a stack of images is old and there are many applications available in Python, Java, MatLab, IDL, ...

vasole avatar Oct 11 '22 11:10 vasole

As far as I recall, to access a particular dataset (not slice), we had agreed to use filename.h5::/path/to/group_or_dataset

I hope the entries in this discussion that contain a single : are just typos and that things like 'silx view bambou_hercules_0001.h5?/?.1/measurement/pcolinux#[0,-1]' will contain the missing :: if they ever get implemented (I hope not because those ? can be problematic).

If at a certain point you need to perform multiple selections, remember that at some places we have already implemented the --filepattern approach...

vasole avatar Oct 11 '22 12:10 vasole

Yes, : is a typo, the separator currently use is :: and there is no reason to change it.

Also ? is already supported because urls like silx://path/file.h5?path=/data/path are supported, but I agree it should not be the only option because ? and & can be problematic on the command line.

t20100 avatar Oct 11 '22 14:10 t20100

I recall @vallsv had foreseen two options, one internal and one for the command line.

vasole avatar Oct 11 '22 14:10 vasole

Yes i remember there was 2 syntaxes, one which can be parsed with URL stuffs, and another one to simplify the command line.

I agree, i me pretty sure the single : after the filename is a typo.

vallsv avatar Oct 12 '22 07:10 vallsv

I recall having some concerns (corner cases, complexity) with this PR... It really needs a second look.

t20100 avatar May 12 '23 15:05 t20100

I'm closing this PR in favor of PR #3677 which provides the --slice option that is extracted from here. Thanks for the discussions

t20100 avatar Jun 15 '23 09:06 t20100