Error trying to read output data from PIConGPU 0.5.0 via openPMD-api 0.15.1

Open berceanu opened this issue 1 year ago • 8 comments

Using Python 3.11.3, openPMD-api 0.15.1 and some simulation output data from PIConGPU 0.5.0, I get the following error:

>>> import openpmd_api as io
>>> s = io.Series("./simData_99998.h5", io.Access.read_only)
HDF5-DIAG: Error detected in HDF5 (1.12.2) thread 1:
  #000: H5A.c line 454 in H5Aopen(): unable to open attribute: 'unitSI'
    major: Attribute
    minor: Can't open object
  #001: H5VLcallback.c line 1091 in H5VL_attr_open(): attribute open failed
    major: Virtual Object Layer
    minor: Can't open object
  #002: H5VLcallback.c line 1058 in H5VL__attr_open(): attribute open failed
    major: Virtual Object Layer
    minor: Can't open object
  #003: H5VLnative_attr.c line 124 in H5VL__native_attr_open(): unable to open attribute: 'unitSI'
    major: Attribute
    minor: Can't open object
  #004: H5Aint.c line 423 in H5A__open(): unable to load attribute info from object header for attribute: 'unitSI'
    major: Attribute
    minor: Can't open object
  #005: H5Oattribute.c line 494 in H5O__attr_open_by_name(): can't locate attribute: 'unitSI'
    major: Attribute
    minor: Object not found
[AbstractIOHandlerImpl] IO Task READ_ATT failed with exception. Clearing IO queue and passing on the exception.
Cannot read record component 'numParticles' in particle patch and will skip it due to read error:
Read Error in backend HDF5
Object type:    Attribute
Error type:     NotFound
Further description:    [HDF5] Internal error: Failed to open HDF5 attribute 'unitSI' (/data/99998/particles/e_highGamma/particlePatches/numParticles/) during attribute read
HDF5-DIAG: Error detected in HDF5 (1.12.2) thread 1:
  #000: H5A.c line 454 in H5Aopen(): unable to open attribute: 'unitSI'
    major: Attribute
    minor: Can't open object
  #001: H5VLcallback.c line 1091 in H5VL_attr_open(): attribute open failed
    major: Virtual Object Layer
    minor: Can't open object
  #002: H5VLcallback.c line 1058 in H5VL__attr_open(): attribute open failed
    major: Virtual Object Layer
    minor: Can't open object
  #003: H5VLnative_attr.c line 124 in H5VL__native_attr_open(): unable to open attribute: 'unitSI'
    major: Attribute
    minor: Can't open object
  #004: H5Aint.c line 423 in H5A__open(): unable to load attribute info from object header for attribute: 'unitSI'
    major: Attribute
    minor: Can't open object
  #005: H5Oattribute.c line 494 in H5O__attr_open_by_name(): can't locate attribute: 'unitSI'
    major: Attribute
    minor: Object not found
[AbstractIOHandlerImpl] IO Task READ_ATT failed with exception. Clearing IO queue and passing on the exception.
Cannot read record component 'numParticlesOffset' in particle patch and will skip it due to read error:
Read Error in backend HDF5
Object type:    Attribute
Error type:     NotFound
Further description:    [HDF5] Internal error: Failed to open HDF5 attribute 'unitSI' (/data/99998/particles/e_highGamma/particlePatches/numParticlesOffset/) during attribute read
[AbstractIOHandlerImpl] IO Task OPEN_DATASET failed with exception. Clearing IO queue and passing on the exception.
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
RuntimeError: [HDF5] Unknown dataset type
>>> 

It seems PIConGPU 0.5.0 uses openPMD standard v1.0.0, which openPMD-api 0.15.1 should support?

The unitSI attribute seems to be missing from the particlePatches dataset members.

I tried different Python versions and different versions of openPMD-api. The file itself seems to be the problem, since everything works fine with output from PIConGPU 0.6. As far as I know, 0.6 uses openPMD-api for I/O, right? Still, the difference in the output files seems minor, and it looks like adding a few unitSI attributes in the right places would fix this.
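A minimal sketch of the "add a few unitSI" idea, assuming h5py is installed and that you work on a copy of the file. The helper names and the value 1.0 are assumptions for illustration (the two counters are plain particle counts), and I have not verified that this alone makes the file readable.

```python
# Hypothetical helpers: the names and the default value 1.0 are assumptions for illustration.
PATCH_COUNTERS = ("numParticles", "numParticlesOffset")


def missing_unit_si(patches, names=PATCH_COUNTERS):
    """Return the names in `names` that exist in `patches` but lack a 'unitSI' attribute."""
    return [n for n in names if n in patches and "unitSI" not in patches[n].attrs]


def add_missing_unit_si(patches, value=1.0, names=PATCH_COUNTERS):
    """Attach unitSI=value where it is missing and return the names that were patched."""
    patched = missing_unit_si(patches, names)
    for name in patched:
        patches[name].attrs["unitSI"] = value
    return patched


def patch_file(path, patches_path):
    """Open an HDF5 file (ideally a copy) with h5py and patch one particlePatches group."""
    import h5py  # third-party; imported here so the helpers above stay dependency-free

    with h5py.File(path, "r+") as f:
        return add_missing_unit_si(f[patches_path])
```

With the group path from the error message, a call would look like patch_file("simData_99998_copy.h5", "/data/99998/particles/e_highGamma/particlePatches"), where the copy's file name is made up.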

berceanu avatar May 13 '23 18:05 berceanu

It might be that PIConGPU 0.5.0 had an openPMD-api plugin implementation that did not conform to the standard. Perhaps @franzpoeschel can provide details on that. If I remember correctly, 0.5.0 still used libSplash as its HDF5 output library, so the adios hdf5 backend was never used extensively. Perhaps using libSplash will fix your problem. There will probably be no fix for 0.5.0.

PrometheusPi avatar May 22 '23 08:05 PrometheusPi

There seem to be multiple things going on in the error trace. The error that finally caused the crash is apparently not the unitSI thing, because openPMD-api 0.15 detects that kind of error and continues parsing. The final error is [HDF5] Unknown dataset type, which implies that openPMD did not recognize the type of the data in the dataset. (Ideally, this should also be ignored during parsing, but we only just started adding error recovery during parsing.)

If it's in any way possible, can you share the failing file with me? What is the output of h5dump -A simData_99998.h5?

franzpoeschel avatar May 22 '23 08:05 franzpoeschel

Hi @franzpoeschel thank you for the reply and sorry for my late response.

Here is the full file (212M). I am also attaching the h5dump output: dump.txt.

berceanu avatar Jun 27 '23 17:06 berceanu

Ok, there are indeed multiple things happening:

  1. The API looks for attributes /data/126175/particles/e_highGamma/particlePatches/numParticles/unitSI and /data/126175/particles/e_highGamma/particlePatches/numParticlesOffset/unitSI at parsing time. Seeing that they are not present, the records numParticles and numParticlesOffset are skipped. As long as you don't need these particular datasets for now, this is not too problematic as parsing still continues. The API should not require these attributes at that place as they don't really make any sense here. I will need to fix this.
  2. The second issue is what makes the parsing routine crash. According to HDFView, the radiationMask has a custom datatype 8-bit enum (0=FALSE, 1=TRUE) which the openPMD-api does not recognize. Two solutions:
    1. By changing the error type here, we can make the parser recognize this error and skip the component instead of failing the entire parsing process.
    2. When not recognizing a (custom) datatype, we could use H5Tget_super() to fall back to the parent datatype (which in this case would be a char type).

The second issue unfortunately means that there is no workaround to make the file accessible, but I'll need to add a fix. I'll try to provide a fix soon, so we can still get it in the upcoming bugfix release.
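To illustrate the two options for the unrecognized datatype, here is a toy model in plain Python. It is only a sketch of the idea: none of these names are openPMD-api or HDF5 identifiers, the type labels are made up, and `parent` merely plays the role of what H5Tget_super() would return.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical stand-ins: these names and labels only model the idea.
KNOWN_TYPES = {"int8": "CHAR", "float64": "DOUBLE", "uint64": "ULONG"}


@dataclass(frozen=True)
class TypeInfo:
    name: str
    parent: Optional["TypeInfo"] = None  # stands in for the result of H5Tget_super()


class UnknownDatasetType(Exception):
    """Raised when a datatype cannot be mapped to a known type."""


def resolve(t, use_parent=True):
    """Option 2: walk up the parent chain until a known type is found."""
    current = t
    while current is not None:
        if current.name in KNOWN_TYPES:
            return KNOWN_TYPES[current.name]
        current = current.parent if use_parent else None
    raise UnknownDatasetType(t.name)


def parse(components, use_parent=True):
    """Option 1: skip components whose type cannot be resolved instead of aborting."""
    parsed, skipped = {}, []
    for name, t in components.items():
        try:
            parsed[name] = resolve(t, use_parent)
        except UnknownDatasetType:
            skipped.append(name)
    return parsed, skipped
```

In this model, a component typed TypeInfo("bool_enum", parent=TypeInfo("int8")) is skipped when use_parent=False and resolves to the parent's label when use_parent=True, while all other components are parsed either way.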

franzpoeschel avatar Jun 29 '23 16:06 franzpoeschel

For the radiation mask, changing this

https://github.com/ComputationalRadiationPhysics/picongpu/blob/454a28efe6a6eb0fa1e0e97031ef2569a1ad313b/include/picongpu/param/speciesAttributes.param#L121

to uint32_t could work too. I have not tested it, but if the compiler does not complain about the implicit conversion from bool to uint32_t within the code, it should work. It does not help with already written datasets, though.

psychocoderHPC avatar Jun 30 '23 08:06 psychocoderHPC

https://github.com/openPMD/openPMD-api/pull/1469 https://github.com/openPMD/openPMD-api/pull/1470

These should address the issues with reading that file.

franzpoeschel avatar Jun 30 '23 10:06 franzpoeschel

I'm trying to create a lightweight dataset for testing purposes. When compiling a default Bunch or KelvinHelmholtz simulation on PIConGPU 0.5.0, the radiationMask dataset does not appear. What do I need to change? @psychocoderHPC

EDIT: I think I found it:

            /* filter to enable radiation for electrons
             *
             * to enable the filter:
             *   - goto file `speciesDefinition.param`
             *   - add the attribute `radiationMask` to the electron species
             */

franzpoeschel avatar Jul 03 '23 14:07 franzpoeschel

pic-build -t 2 will compile the example with radiation enabled. It selects the second case from the file cmakeFlags.

psychocoderHPC avatar Jul 03 '23 16:07 psychocoderHPC