Cross-Record Grouping

Open DavidSagan opened this issue 7 years ago • 5 comments

Right now there is only one root group for meshes: /data/meshes/. Particles, by contrast, can have many: /data/particles/electrons/, /data/particles/protons/, etc.

This is restrictive. For example, it may be desirable to store internal fields separately from external fields. A structure that allows groups like /data/meshes/dc_field/ and /data/meshes/particle_field/ would therefore be desirable.

@ax3l: Please put this in the 2.0 project.

DavidSagan avatar Dec 10 '17 03:12 DavidSagan

@DavidSagan thank you for the proposal!

Can you elaborate a little on what is missing? Can't this be realized with individual names of meshes for external fields already as in your example?

Why would we need to group the fields further instead of just naming them as a domain-expert would understand?

ax3l avatar Dec 12 '17 09:12 ax3l

@ax3l

Looking at this issue again, I might be confusing the base standard with the ED-PIC extension. In any case, if I look at the example openPMD datasets, I see paths like /data/400/fields/E, and there is no way to specify multiple E-fields since meshesPath is fields. My proposal is to add a directory level so that you could have /data/400/fields/dc-field/E, /data/400/fields/ac-field/E, etc.

Why would we need to group the fields further instead of just naming them as a domain-expert would understand?

I don't understand this.

DavidSagan avatar Dec 12 '17 16:12 DavidSagan

@DavidSagan I think that Axel's point is that you can always use: /data/400/fields/Efield_dc and /data/400/fields/Efield_ac (which would be compatible with the current standard) instead of /data/400/fields/dc-field/E and /data/400/fields/ac-field/E.

Would that be okay with you?

RemiLehe avatar Dec 12 '17 18:12 RemiLehe

@RemiLehe

I think that Axel's point is that you can always use: /data/400/fields/Efield_dc and /data/400/fields/Efield_ac (which would be compatible with the current standard) instead of /data/400/fields/dc-field/E and /data/400/fields/ac-field/E.

If I look at the "Naming Conventions for mesh records (field records)" section of the EXT_ED-PIC extension, it looks like /data/400/fields/Efield_dc would not be compatible since, in this example, meshesPath is set to "fields/". But this is an extension standard, and it looks like the base standard does not impose this, so all of this may just be me confusing the extension with the base.

DavidSagan avatar Dec 12 '17 18:12 DavidSagan

After thinking about it: as an alternative to prefixes/suffixes in names and to nested groups, which are imho harder for readers to parse and complicate things, we could also go for the following: we allow each mesh (and particle) record to carry an optional attribute, say "grouping"/"groupIdentifier"/"group", that holds an arbitrary, user-defined identifier (e.g. we decide on strings).

With such an attribute, parsing without any need for groups stays as easy as it is now, and one can get simple additional info if needed. Furthermore, we could allow a semicolon-separated list of strings as identifiers, so that a record can belong to more than one group, which is a nice addition.

Also, I want to keep nested sub-directories open for meshes because we will need them heavily for more complicated geometries & AMR.

Example

Let's allow a general group string attribute on all records, including meshes and particles.

The group can be user-given and just needs to match to define a group between arbitrary records. If we again define ; as a special separator character, this attribute can express an N:M relation between records and user-defined groups. Automated readers can then just build a map from each group to the names of its records.
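
A minimal sketch of how a reader might build that map, assuming the attribute ends up being called "group" and uses ; as the separator. The record names (E_dc, E_ac, B, rho) and the helper build_group_map are hypothetical, and plain dictionaries stand in for the attributes that would really be read from the file:

```python
from collections import defaultdict


def build_group_map(record_attrs, attr_name="group", sep=";"):
    """Map each group identifier to the sorted names of the records tagged with it.

    record_attrs: dict of record name -> dict of that record's attributes.
    attr_name and sep are assumptions; the proposal has not fixed either.
    """
    groups = defaultdict(list)
    for record_name, attrs in record_attrs.items():
        raw = attrs.get(attr_name, "")  # the attribute is optional
        for ident in raw.split(sep):
            ident = ident.strip()
            if ident:  # skip empty entries such as a trailing ";"
                groups[ident].append(record_name)
    return {ident: sorted(names) for ident, names in groups.items()}


# Hypothetical mesh records standing in for attributes read from a file.
records = {
    "E_dc": {"group": "external"},
    "E_ac": {"group": "external;time_varying"},  # member of two groups
    "B": {"group": "internal"},
    "rho": {},  # no group attribute: not part of any group
}

group_map = build_group_map(records)
print(group_map)
```

Records without the attribute are simply left out of the map, so existing files without any grouping would still be read exactly as before.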

ax3l avatar Jan 23 '18 12:01 ax3l