
silx view does not interpret nxdata written by pni-libraries

Open jkotan opened this issue 7 years ago • 8 comments

Hi,

It looks like silx view is not able to interpret NXdata in a file written by python-pni. It cannot read the string attributes (which were not written by h5py), so the NXdata group is not displayed as a plot.

An example of the file written by pni can be found here.

Bests, Jan

jkotan avatar Apr 11 '18 07:04 jkotan

@eugenwintersberger

Your attributes are not strings but arrays of strings. It is a very common mistake we have observed on files generated using C++. Please make sure you conform to the NeXus specifications.


vasole avatar Apr 11 '18 08:04 vasole

The attributes are stored with DATASPACE SIMPLE { ( 1 ) / ( H5S_UNLIMITED ) } and STRSIZE H5T_VARIABLE, e.g.

   ATTRIBUTE "NX_class" {
      DATATYPE  H5T_STRING {
         STRSIZE H5T_VARIABLE;
         STRPAD H5T_STR_NULLTERM;
         CSET H5T_CSET_UTF8;
         CTYPE H5T_C_S1;
      }
      DATASPACE  SIMPLE { ( 1 ) / ( H5S_UNLIMITED ) }
      DATA {
      (0): "NXroot"
      } 
  }

which h5py reads as an attribute array. However, compared to fixed-size scalar string attributes, the ones created by pni have advantages, e.g. an attribute array can be easily extended.

To my knowledge, both ways of writing attributes are standard-conforming (see https://github.com/nexpy/nexpy/issues/146), and the linked file is a valid NeXus file.
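For illustration, the difference between the two layouts can be reproduced with h5py alone. This is only a sketch: the attribute names `scalar_class` and `array_class` and the temporary file are made up for the example, and the array attribute here has a fixed shape of (1,) rather than the unlimited maximum dimension shown in the dump above.

```python
import os
import tempfile

import h5py
import numpy as np

# Hypothetical scratch file, not the file linked in this issue.
path = os.path.join(tempfile.mkdtemp(), "attr_demo.h5")

with h5py.File(path, "w") as f:
    # Scalar variable-length string: what h5py writes for a plain Python str.
    f.attrs["scalar_class"] = "NXroot"
    # Rank-1, length-1 array of variable-length UTF-8 strings,
    # similar in shape to the attributes discussed in this thread.
    f.attrs.create(
        "array_class",
        data=["NXroot"],
        dtype=h5py.string_dtype(encoding="utf-8"),
    )

with h5py.File(path, "r") as f:
    scalar_value = f.attrs["scalar_class"]
    array_value = f.attrs["array_class"]

# The scalar attribute has no dimensions; the other one is a 1-element array.
print(type(scalar_value), np.ndim(scalar_value))
print(type(array_value), np.shape(array_value))
```

A reader that compares the attribute value directly against a string will behave differently for the two cases, because the second value is a NumPy array and not a single string.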

jkotan avatar Apr 11 '18 12:04 jkotan

No, they are not conforming to the standard. Your files have to be fixed.

vasole avatar Apr 11 '18 23:04 vasole

http://download.nexusformat.org/doc/html/datarules.html#nexus-data-types

"NeXus accepts both variable and fixed length strings, as well as arrays of strings. Software that reads NeXus data files should support all of these.

Some file writers write strings as a string array of rank 1 and length 1. Clients should be prepared to handle such strings."

jkotan avatar May 16 '18 08:05 jkotan

Here is the diff https://github.com/nexusformat/definitions/commit/bed843ff1895120dafd564bc82ae0dcc4b1f984e and the related issue https://github.com/nexusformat/definitions/issues/281.

vallsv avatar May 23 '18 15:05 vallsv

It should be possible to modify the silx.io.nxdata._utils.get_attr_as_unicode function to accommodate this, e.g. with an extra output_format argument that allows converting a list of strings of length 1 to a string (and raises an error when the length of the list is not 1).

Contribution welcome.
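A rough sketch of what such a permissive conversion could look like is given below. It is not the silx implementation: `attr_to_str` and its `squeeze` flag are hypothetical names standing in for the suggested output_format argument, and the helper assumes attribute values of rank 0 or 1 only.

```python
def _decode(item):
    """Return item as str, decoding bytes as UTF-8."""
    if isinstance(item, bytes):
        return item.decode("utf-8")
    return str(item)


def attr_to_str(value, squeeze=True):
    """Convert an HDF5 string attribute value to str (hypothetical helper).

    Accepts str, bytes, or a rank-1 sequence of them. With squeeze=True,
    a sequence of length 1 is unwrapped to a single string and any other
    length raises ValueError. With squeeze=False, sequences are returned
    as a list of str.
    """
    # NumPy arrays and NumPy scalars expose tolist(); use it to get
    # plain Python objects without importing numpy here.
    if hasattr(value, "tolist"):
        value = value.tolist()
    if isinstance(value, (str, bytes)):
        return _decode(value)
    items = [_decode(item) for item in value]
    if not squeeze:
        return items
    if len(items) != 1:
        raise ValueError(
            "expected a single string, got %d strings" % len(items)
        )
    return items[0]


print(attr_to_str("NXdata"))
print(attr_to_str([b"NXdata"]))
print(attr_to_str(["x", "y"], squeeze=False))
```

Raising on any length other than 1 keeps the reader strict for attributes that are meant to hold a single string, while still tolerating the rank-1, length-1 case.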

t20100 avatar Sep 19 '18 09:09 t20100

strings

NX_CHAR: The preferred string representation is UTF-8. Both fixed-length strings and variable-length strings are valid. String arrays cannot be used where only a string is expected (title, start_time, end_time, NX_class attribute,…). Fields or attributes requiring the use of string arrays will be clearly marked as such (like the NXdata attribute auxiliary_signals).

vasole avatar Mar 24 '20 12:03 vasole

Sounds better. The question remains whether we want the reading to be permissive or not... Given that this issue has been open for a while, it does not seem critical.

t20100 avatar Mar 25 '20 08:03 t20100