openPMD-api

ADIOS2 schema 2022_07_26, based on ADIOS2 modifiable attributes

Open franzpoeschel opened this issue 3 years ago • 0 comments

Based on, and largely compatible with, the old ADIOS2 schema 0, this PR removes schema 2021. Instead, it uses the allowModification parameter of the DefineAttribute() call in ADIOS2 to provide better support for ADIOS2 steps, which was the main motivation for schema 2021.

Problem with schema 0: Groups could not be associated with single steps, which made the schema unusable for variable-based iteration encoding. In schema 0, the group hierarchy was restored at read time indirectly, by inquiring attributes and variables. Since attributes cannot be deleted in ADIOS2, a group could not be deleted once it had been defined.
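To illustrate this limitation, here is a minimal sketch of how a reader might reconstruct groups purely from attribute and variable names. It is a simplified model, not openPMD-api code: the function name `infer_groups` and the plain lists of names are assumptions made for illustration, and the sketch ignores the distinction between groups and datasets.

```python
def infer_groups(names):
    """Return every proper parent path implied by the given names."""
    groups = set()
    for name in names:
        parts = name.strip("/").split("/")
        # Every proper prefix of a name is treated as a group.
        for depth in range(1, len(parts)):
            groups.add("/" + "/".join(parts[:depth]))
    return groups


# Attributes accumulate across steps because ADIOS2 cannot delete them.
step0 = ["/data/meshes/E/0/unitSI"]
step1 = step0 + ["/data/meshes/E/1/unitSI"]  # the old attribute is still visible

groups_step1 = infer_groups(step1)
```

In this model, `/data/meshes/E/0` is still inferred in step 1 even if the writer no longer considers it active, which is the behavior the group table is meant to avoid.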

The new schema (2022) introduces a meta table for tracking active groups in the hierarchy. The following example dataset was created by the variableBasedSeries test:

Step 0:
  string    /basePath                                  attr   = "/data/%T/"
  double    /data/dt                                   attr   = 1
  double    /data/meshes/E/0/position                  attr   = 0
  uint64_t  /data/meshes/E/0/shape                     attr   = 1
  double    /data/meshes/E/0/unitSI                    attr   = 1
  uint64_t  /data/meshes/E/0/value                     attr   = 0
  uint64_t  /data/meshes/E/attr_0                      attr   = 0
  string    /data/meshes/E/axisLabels                  attr   = {"x"}
  string    /data/meshes/E/dataOrder                   attr   = "C"
  string    /data/meshes/E/geometry                    attr   = "cartesian"
  double    /data/meshes/E/gridGlobalOffset            attr   = 0
  double    /data/meshes/E/gridSpacing                 attr   = 1
  double    /data/meshes/E/gridUnitSI                  attr   = 1
  float     /data/meshes/E/timeOffset                  attr   = 0
  double    /data/meshes/E/unitDimension               attr   = {0, 0, 0, 0, 0, 0, 0}
  int32_t   /data/meshes/E/x                           {1000}
  double    /data/meshes/E/x/position                  attr   = 0
  double    /data/meshes/E/x/unitSI                    attr   = 1
  int32_t   /data/meshes/E/y                           {1}
  double    /data/meshes/E/y/position                  attr   = 0
  double    /data/meshes/E/y/unitSI                    attr   = 1
  uint64_t  /data/snapshot                             attr   = 0
  double    /data/time                                 attr   = 0
  double    /data/timeUnitSI                           attr   = 1
  string    /date                                      attr   = "2022-08-17 14:59:15 +0000"
  string    /iterationEncoding                         attr   = "variableBased"
  string    /iterationFormat                           attr   = "/data"
  string    /meshesPath                                attr   = "meshes/"
  string    /openPMD                                   attr   = "1.1.0"
  uint32_t  /openPMDextension                          attr   = 0
  string    /software                                  attr   = "openPMD-api"
  string    /softwareVersion                           attr   = "0.15.0-dev"
  uint64_t  __openPMD_groups/data                      attr   = 0
  uint64_t  __openPMD_groups/data/meshes               attr   = 0
  uint64_t  __openPMD_groups/data/meshes/E             attr   = 0
  uint64_t  __openPMD_groups/data/meshes/E/0           attr   = 0
  uint64_t  __openPMD_internal/openPMD2_adios2_schema  attr   = 20220726
  uint8_t   __openPMD_internal/useSteps                attr   = 1

Step 1:
  string    /basePath                                  attr   = "/data/%T/"
  double    /data/dt                                   attr   = 1
  double    /data/meshes/E/0/position                  attr   = 0
  uint64_t  /data/meshes/E/0/shape                     attr   = 1
  double    /data/meshes/E/0/unitSI                    attr   = 1
  uint64_t  /data/meshes/E/0/value                     attr   = 0
  double    /data/meshes/E/1/position                  attr   = 0
  uint64_t  /data/meshes/E/1/shape                     attr   = 1
  double    /data/meshes/E/1/unitSI                    attr   = 1
  uint64_t  /data/meshes/E/1/value                     attr   = 1
  uint64_t  /data/meshes/E/attr_0                      attr   = 0
  uint64_t  /data/meshes/E/attr_1                      attr   = 1
  string    /data/meshes/E/axisLabels                  attr   = {"x"}
  string    /data/meshes/E/dataOrder                   attr   = "C"
  string    /data/meshes/E/geometry                    attr   = "cartesian"
  double    /data/meshes/E/gridGlobalOffset            attr   = 0
  double    /data/meshes/E/gridSpacing                 attr   = 1
  double    /data/meshes/E/gridUnitSI                  attr   = 1
  float     /data/meshes/E/timeOffset                  attr   = 0
  double    /data/meshes/E/unitDimension               attr   = {0, 0, 0, 0, 0, 0, 0}
  int32_t   /data/meshes/E/x                           {1000}
  double    /data/meshes/E/x/position                  attr   = 0
  double    /data/meshes/E/x/unitSI                    attr   = 1
  int32_t   /data/meshes/E/y                           {2, 2}
  double    /data/meshes/E/y/position                  attr   = 0
  double    /data/meshes/E/y/unitSI                    attr   = 1
  uint64_t  /data/snapshot                             attr   = 1
  double    /data/time                                 attr   = 0
  double    /data/timeUnitSI                           attr   = 1
  string    /date                                      attr   = "2022-08-17 14:59:15 +0000"
  string    /iterationEncoding                         attr   = "variableBased"
  string    /iterationFormat                           attr   = "/data"
  string    /meshesPath                                attr   = "meshes/"
  string    /openPMD                                   attr   = "1.1.0"
  uint32_t  /openPMDextension                          attr   = 0
  string    /software                                  attr   = "openPMD-api"
  string    /softwareVersion                           attr   = "0.15.0-dev"
  uint64_t  __openPMD_groups/data                      attr   = 1
  uint64_t  __openPMD_groups/data/meshes               attr   = 1
  uint64_t  __openPMD_groups/data/meshes/E             attr   = 1
  uint64_t  __openPMD_groups/data/meshes/E/0           attr   = 0
  uint64_t  __openPMD_groups/data/meshes/E/1           attr   = 1
  uint64_t  __openPMD_internal/openPMD2_adios2_schema  attr   = 20220726
  uint8_t   __openPMD_internal/useSteps                attr   = 1

Step 2:
  string    /basePath                                  attr   = "/data/%T/"
  double    /data/dt                                   attr   = 1
  double    /data/meshes/E/0/position                  attr   = 0
  uint64_t  /data/meshes/E/0/shape                     attr   = 1
  double    /data/meshes/E/0/unitSI                    attr   = 1
  uint64_t  /data/meshes/E/0/value                     attr   = 0
  double    /data/meshes/E/1/position                  attr   = 0
  uint64_t  /data/meshes/E/1/shape                     attr   = 1
  double    /data/meshes/E/1/unitSI                    attr   = 1
  uint64_t  /data/meshes/E/1/value                     attr   = 1
  double    /data/meshes/E/2/position                  attr   = 0
  uint64_t  /data/meshes/E/2/shape                     attr   = 1
  double    /data/meshes/E/2/unitSI                    attr   = 1
  uint64_t  /data/meshes/E/2/value                     attr   = 2
  uint64_t  /data/meshes/E/attr_0                      attr   = 0
  uint64_t  /data/meshes/E/attr_1                      attr   = 1
  uint64_t  /data/meshes/E/attr_2                      attr   = 2
  string    /data/meshes/E/axisLabels                  attr   = {"x"}
  string    /data/meshes/E/dataOrder                   attr   = "C"
  string    /data/meshes/E/geometry                    attr   = "cartesian"
  double    /data/meshes/E/gridGlobalOffset            attr   = 0
  double    /data/meshes/E/gridSpacing                 attr   = 1
  double    /data/meshes/E/gridUnitSI                  attr   = 1
  float     /data/meshes/E/timeOffset                  attr   = 0
  double    /data/meshes/E/unitDimension               attr   = {0, 0, 0, 0, 0, 0, 0}
  int32_t   /data/meshes/E/x                           {1000}
  double    /data/meshes/E/x/position                  attr   = 0
  double    /data/meshes/E/x/unitSI                    attr   = 1
  int32_t   /data/meshes/E/y                           {3, 3, 3}
  double    /data/meshes/E/y/position                  attr   = 0
  double    /data/meshes/E/y/unitSI                    attr   = 1
  uint64_t  /data/snapshot                             attr   = 2
  double    /data/time                                 attr   = 0
  double    /data/timeUnitSI                           attr   = 1
  string    /date                                      attr   = "2022-08-17 14:59:15 +0000"
  string    /iterationEncoding                         attr   = "variableBased"
  string    /iterationFormat                           attr   = "/data"
  string    /meshesPath                                attr   = "meshes/"
  string    /openPMD                                   attr   = "1.1.0"
  uint32_t  /openPMDextension                          attr   = 0
  string    /software                                  attr   = "openPMD-api"
  string    /softwareVersion                           attr   = "0.15.0-dev"
  uint64_t  __openPMD_groups/data                      attr   = 2
  uint64_t  __openPMD_groups/data/meshes               attr   = 2
  uint64_t  __openPMD_groups/data/meshes/E             attr   = 2
  uint64_t  __openPMD_groups/data/meshes/E/0           attr   = 0
  uint64_t  __openPMD_groups/data/meshes/E/1           attr   = 1
  uint64_t  __openPMD_groups/data/meshes/E/2           attr   = 2
  uint64_t  __openPMD_internal/openPMD2_adios2_schema  attr   = 20220726
  uint8_t   __openPMD_internal/useSteps                attr   = 1

For each IO step i and active path <p>, the value of the modifiable attribute __openPMD_groups/<p> is set to i in that step. (Note that the simpler alternative of a boolean "active" flag, such as openPMD_group_is_active/<p> = true, does not work in parallel contexts. Using the step index has the advantage that the flag need not be reset.) The LIST_PATHS IO task can then be implemented using only the group table. A path exists if:

  1. Its entry in the meta table exists
  2. EITHER the file being read does not use ADIOS2 steps OR its attribute value is equal to the current step index
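The two conditions above can be sketched as follows. This is a simplified model under stated assumptions, not the actual implementation: a plain dict stands in for the modifiable ADIOS2 attributes, and the helper names `mark_active` and `path_exists` are hypothetical.

```python
GROUP_PREFIX = "__openPMD_groups"


def mark_active(table, paths, step):
    """Writer side: stamp every active path with the current step index."""
    for path in paths:
        # Overwriting models a modifiable attribute being redefined.
        table[GROUP_PREFIX + path] = step


def path_exists(table, path, current_step, uses_steps=True):
    """Reader side: apply the two existence conditions."""
    key = GROUP_PREFIX + path
    if key not in table:  # condition 1: an entry in the meta table exists
        return False
    # Condition 2: no steps in use, or the entry was stamped in this step.
    return (not uses_steps) or table[key] == current_step


table = {}
mark_active(table, ["/data", "/data/meshes/E/0"], step=0)
mark_active(table, ["/data", "/data/meshes/E/1"], step=1)
```

With this table, `path_exists(table, "/data/meshes/E/0", 1)` is false because the entry still carries step index 0, matching the Step 1 listing above, where `__openPMD_groups/data/meshes/E/0` remains 0.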

Using a table in this form allows for an algorithmically quick lookup via prefix search in a sorted map, and also visually declutters the metadata by putting the table in one block as seen in the above output of bpls -alt.
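As a rough illustration of the prefix search, the sketch below uses a sorted Python list with `bisect` as a stand-in for a sorted map. The function name `list_children` is hypothetical, and the step-index check is left out to keep the example focused on the lookup itself.

```python
from bisect import bisect_left


def list_children(sorted_keys, prefix):
    """Return the direct child names below `prefix` via a prefix range scan."""
    if not prefix.endswith("/"):
        prefix += "/"
    children = []
    # Jump to the first key that could start with the prefix.
    start = bisect_left(sorted_keys, prefix)
    for key in sorted_keys[start:]:
        if not key.startswith(prefix):
            break  # sorted order: no later key can match
        child = key[len(prefix):].split("/", 1)[0]
        # Matching keys are contiguous, so duplicates are adjacent.
        if child and (not children or children[-1] != child):
            children.append(child)
    return children


keys = sorted([
    "__openPMD_groups/data",
    "__openPMD_groups/data/meshes",
    "__openPMD_groups/data/meshes/E",
    "__openPMD_groups/data/meshes/E/0",
    "__openPMD_groups/data/meshes/E/1",
])
```

Because all keys sharing a prefix are contiguous in sorted order, the scan touches only the matching entries after a logarithmic search for the starting position.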

Such tricks are necessary only for paths, not for datasets, since ADIOS2 variables (i.e. datasets) are clearly associated with IO steps.

Drawback that we accept with this design: Unlike groups, an attribute once written can only be modified ~~(not yet implemented)~~, but not deleted. However, we still need to expose the allowModification parameter in some way to enable mutable user-defined attributes (see TODOs).

TODO

  • [ ] Merge #1291 first
  • [x] Make metadata (attributes) modifiable? Ideas: JSON parameter "metadata_changes", always make constant record components modifiable, attributes mutable by default in variable-based iteration encoding
  • [x] code cleanup: remove old* names, introduce if constexpr

franzpoeschel, Aug 16 '22 10:08