silx view: Large 2D datasets are open

Open t20100 opened this issue 1 year ago • 7 comments

When opening a large 2D dataset containing many curves (shape: 1e6 x 3000, dtype: float32), it takes too long to display with the default view (raw?), while one wants to browse the curves.

As a workaround, it is possible to open first another dataset and select the curves plot before selecting the 2D dataset.

It would be good to avoid this by either adding a confirmation before loading really large datasets or by changing the default view when the 2D data is really large.

Related to https://github.com/silx-kit/h5web/issues/1651

t20100 avatar May 22 '24 07:05 t20100

I guess the dataset does not identify itself as being of spectrum type.

What about deciding between spectrum and image based on the ratio between rows and columns when no previous visualization has taken place? I guess images will usually not have orders of magnitude between the two dimensions.
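A minimal sketch of the ratio idea, not silx code: `guess_default_view` and its `ratio_threshold` default of 100 are hypothetical names and values chosen only for illustration.

```python
def guess_default_view(shape, ratio_threshold=100):
    """Guess a default view for a plain 2D dataset from its shape alone.

    Hypothetical helper: the name and the default threshold are
    assumptions, not part of silx.
    """
    if len(shape) != 2:
        raise ValueError("expected a 2D shape")
    rows, cols = shape
    if rows == 0 or cols == 0:
        # Empty dataset: nothing to discriminate on, keep the image view.
        return "image"
    # Ratio between the longer and the shorter dimension.
    ratio = max(rows, cols) / min(rows, cols)
    # Orders of magnitude between the dimensions suggest a set of curves.
    return "curves" if ratio >= ratio_threshold else "image"
```

With the shape from this issue, `guess_default_view((1_000_000, 3000))` gives a ratio of about 333 and selects the curves view, while a square detector frame such as `(2048, 2048)` stays an image.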

vasole avatar May 22 '24 09:05 vasole

There are no attributes or NXdata description, just a plain 2D dataset.

The ratio can be a good way to discriminate. I was also considering the number of pixels, to give an idea of how demanding it will be to load. We probably don't want to display a 1e6 x 1e6 pixel image straight away either.
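A rough sketch of the size check, not silx code: `estimated_nbytes`, `needs_confirmation` and the 2 GiB default budget are hypothetical and only illustrate the idea.

```python
import math


def estimated_nbytes(shape, itemsize):
    """Bytes needed to hold the whole dataset in memory, uncompressed."""
    return math.prod(shape) * itemsize


def needs_confirmation(shape, itemsize, max_bytes=2 * 1024**3):
    """Return True if loading the full dataset exceeds the memory budget.

    Hypothetical helper: the 2 GiB default budget is an assumption.
    """
    return estimated_nbytes(shape, itemsize) > max_bytes
```

For the dataset in this issue (1e6 x 3000, float32, so 4 bytes per value), the estimate is 1.2e10 bytes, about 12 GB, which is well above such a budget. A 2048 x 2048 float32 image is 16 MiB and would load without asking.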

t20100 avatar May 22 '24 11:05 t20100

This issue is the same for MX data produced by Dectris. Silx is essentially unusable for MX.

woutdenolf avatar Oct 12 '24 10:10 woutdenolf

@woutdenolf

Does the problem also affect PyMca? I would like to take a look at the offending datasets. Among other possibilities, the problem might be related to compression in huge chunks.

vasole avatar Oct 13 '24 18:10 vasole

From what I know, there are currently two issues with silx view regarding large datasets:

  • Opening a 2D dataset that is too large to be displayed as an image (e.g., a stack of 1e6 azimuthal integrations).
  • Opening a 3D dataset as a stack of images to display a single slice: in one of the views, it loads the whole stack. The code of silx view is meant to avoid loading all the data, but it looks quite fragile, and depending on the NeXus @interpretation it picks a different view to display the data (see #4167). @woutdenolf, I think this is more likely the issue you are facing with Dectris files.

@vasole, it depends on which part of silx PyMca is using:

  • For the first issue, unless PyMca checks the size of the dataset before displaying it, it will have the same issue.
  • For the second issue, from what I recall, the problem is in how the axes of the 3D dataset are mapped to what is displayed and how that mapping gets used, so it should be specific to silx view.

IMO both issues should be fixed for the next release.

t20100 avatar Oct 14 '24 07:10 t20100

> This issue is the same for MX data produced by Dectris. Silx is essentially unusable for MX.

Well, PyMca only has issues when accessing the file via silx (show information, double click...)

When accessing the target file from silx, everything works on the silx side. When accessing the file via the master file, silx hangs.

There is a badly configured NXdata group too.

vasole avatar Oct 14 '24 10:10 vasole

What you see is most likely related to #4167: If silx view picks the StackView widget for display, it loads the whole stack (while it should not!)...
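To illustrate what "not loading the whole stack" means, here is a minimal sketch that does not reflect the actual StackView code. `read_frame` is a hypothetical helper that works on any array-like object exposing `shape` and integer indexing, such as an h5py dataset, where indexing reads only the selected part from the file.

```python
def read_frame(stack, index):
    """Return one 2D frame of a 3D array-like without touching the others.

    Hypothetical helper: it assumes frames are stacked along the first
    axis, which need not match how a given file is laid out.
    """
    if len(stack.shape) != 3:
        raise ValueError("expected a 3D stack of images")
    n_frames = stack.shape[0]
    if not -n_frames <= index < n_frames:
        raise IndexError("frame index out of range")
    # Index only the first axis: with a lazy container such as an
    # h5py dataset, only this frame is read, not the whole stack.
    return stack[index]
```

With h5py this could be called as `read_frame(h5file["/path/to/data"], 0)` on an open file, where the dataset path is a placeholder. The thing to avoid is converting the dataset to a full in-memory array (for example with `dataset[()]`) before picking the frame.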

t20100 avatar Oct 14 '24 12:10 t20100