openPMD-standard Move mesh geometry elsewhere

This issue is a bit messy, sorry. My ideas are not clear yet, but I hope constructive discussion will follow.

Two issues, #141 and #142, bring changes to the mesh structure: meshes will no longer be spatial only (they may be of any type), and their spacings may not be regular. Who knows, maybe unstructured meshes will be added some day. All this might introduce significant overhead, both in terms of memory and in terms of saturating the standard with complexity. The geometry attribute and its associated attributes thus become too restrictive. The difference between cartesian and thetaMode only appears on spatial coordinates, which will be a special case in v2.0, and it does not provide any information on the spacings.

It is true that one could provide a ton of details in geometryParameters to indicate how to reconstruct the final view. However, this would look a bit bloated and would not be convenient for parsing.

I suggest that the geometry information be spread among (optional) details, potentially moved to an extension. I am not sure yet what the best approach would be. Note that these changes impact most of the attributes in a mesh record: geometry, geometryParameters, axisLabels, gridSpacing, gridGlobalOffset. Some of these do not make sense in all geometries.

Some things to think about:

- I see two locations to move the geometry attributes to: (1) a new extension, but some attributes would still be necessary in the mesh record to indicate where to find the information, so I am not sure this helps at all; or (2) geometry=some/link/to/a/dataset containing a bunch of attributes, which can be re-used anywhere the same geometry appears. Another possibility is simply to keep everything in the same place as in v1.0, but make some attributes required only with some geometries.
- Maybe some inspiration from VTK would be useful. We could categorize geometries as a tree. There would be two main categories, "structured" and "unstructured". The former would have "rectilinear" and "curvilinear" subcategories, etc.
- For each kind of geometry we can think of, we need to list the necessary attributes. The problem with attributes is that they are not necessarily the same for all types of geometry, so all the attributes of a mesh record should be moved together with this geometry attribute. For example, a cartesian geometry would need a very basic gridSpacing (one spacing for each dimension), but a spherical or rectilinear geometry would need a list of values in each dimension. For unstructured grids, this could be even more complex, and gridGlobalOffset might not even make sense.
- Concerning the attribute axisLabels, we have to be careful about what it really means. Is it the label of the raw data axes? Or the label that should be plotted once the data is constructed? This can make a serious difference when the geometry is not cartesian, or when some construction is necessary (as with thetaMode).
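To make the "required only with some geometries" idea concrete, here is a minimal sketch of a per-geometry attribute check. The geometry kinds, the required sets, and the helper `missing_attributes` are all hypothetical and chosen for illustration; none of this is part of the openPMD standard.

```python
# Hypothetical mapping from geometry kind to the mesh-record attributes it
# would require. The kinds and the required sets are illustrative assumptions.
REQUIRED_ATTRIBUTES = {
    "cartesian": {"axisLabels", "gridSpacing", "gridGlobalOffset"},
    "rectilinear": {"axisLabels", "gridSpacing", "gridGlobalOffset"},
    "unstructured": {"axisLabels"},  # an offset may not make sense here
}


def missing_attributes(geometry, attributes):
    """Return the sorted names of required attributes absent from `attributes`."""
    return sorted(REQUIRED_ATTRIBUTES[geometry] - set(attributes))


# A mesh record modelled as a plain dict of attribute name -> value.
record = {"geometry": "cartesian", "axisLabels": ["x", "y"]}
print(missing_attributes(record["geometry"], record))
```

A reader could use such a table to reject incomplete records early, whichever location (extension, linked dataset, or the mesh record itself) ends up holding the attributes.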

Feb 26 '18 10:02 mccoys

Thank you for documenting your ideas!

I think moving geometry generally into an extension would be marvelous, even independent of the special spatial case of thetaMode.

Such an extension would and should be very similar to existing visualization markup standards (see e.g. here). Most of them use VTK naming, for good reason: VTK has a great hierarchy of various structured and unstructured meshes.

Btw, another reason to move it into an extension is that the geometry of the data does not imply the visualization geometry. Take for example the thetaMode or a cylindrically symmetric mesh: in real space one could do a full 3D reconstruction to look at it (because it's only a "compression scheme" for full 3D geometry under certain assumptions of symmetry), but one can also decide to look at individual modes or 2D cuts of reconstructed slices alone. The same is true for other geometries: projections and subsets in different geometrical representations do make sense and are not necessarily the same as the raw data geometry.
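As an illustration of the "compression scheme" view, the sketch below evaluates a field at one angle from its azimuthal modes. It assumes the coefficients for a single (r, z) point are stored as [mode 0, Re mode 1, Im mode 1, Re mode 2, ...] and that the imaginary parts multiply sin(m*theta) with a plus sign; both the layout and the sign convention are assumptions that should be checked against the standard. The function name `reconstruct_at_angle` is hypothetical.

```python
import math


def reconstruct_at_angle(modes, theta):
    """Evaluate a field at angle `theta` from its azimuthal mode coefficients.

    `modes` holds scalars for one (r, z) point, in the assumed order
    [m0, re1, im1, re2, im2, ...].
    """
    total = modes[0]
    n_modes = (len(modes) - 1) // 2
    for m in range(1, n_modes + 1):
        total += modes[2 * m - 1] * math.cos(m * theta)
        total += modes[2 * m] * math.sin(m * theta)
    return total


# Mode 0 alone is axisymmetric: the value does not depend on theta.
print(reconstruct_at_angle([1.0], 0.3))
```

Looking at a single mode, or at a 2D cut at fixed theta, uses the same stored data; only the reconstruction step differs.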

That said, in my opinion geometry attributes are an essential hint about what the data starts from, for the various geometric representations that tools can then build from it.

Feb 26 '18 10:02 ax3l

Here is a list of mesh types that I can envision:

cartesianND (meaning actually several geometries cartesian1D, etc.)
- spacings: N floats, or N lists of floats in case of irregular
- offsets: N floats
- labels: N strings
- data: N-dimensional array of arbitrary size
cylindrical (has three coordinates r, theta and z)
- spacings: 3 floats, or 3 lists of floats in case of irregular
- offsets: 3 floats
- labels: 3 strings
- data: 3-dimensional array of arbitrary size
spherical (has three coordinates r, theta and phi)
- spacings: 3 floats, or 3 lists of floats in case of irregular
- offsets: 3 floats
- labels: 3 strings
- data: 3-dimensional array of arbitrary size
unstructured, for some other day ;)
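The "one float, or a list of floats in case of irregular" convention for spacings can be sketched as follows for a single axis. The helper `axis_coordinates` is hypothetical, and it assumes that an irregular spacing list holds one interval per pair of neighbouring points (so `n_points - 1` entries); the in-cell position of the data and unit conversion are ignored.

```python
from itertools import accumulate


def axis_coordinates(spacing, offset, n_points):
    """Return the coordinates of `n_points` grid points along one axis.

    `spacing` is either a single number (regular grid) or a list of
    n_points - 1 intervals between neighbouring points (irregular grid).
    """
    if isinstance(spacing, (int, float)):
        return [offset + i * spacing for i in range(n_points)]
    if len(spacing) != n_points - 1:
        raise ValueError("expected one spacing per interval between points")
    # Running sum of the intervals, starting from the offset.
    return list(accumulate(spacing, initial=offset))


print(axis_coordinates(0.5, 0.0, 3))         # regular axis
print(axis_coordinates([0.5, 1.0], 1.0, 3))  # irregular axis
```

The same function would apply to each of the N axes of cartesianND, and to each of the three axes of cylindrical or spherical, since the proposal gives them the same spacing and offset structure.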

The geometry thetaMode would not be defined explicitly: it is only a special case of cylindrical, where the array has only one point along theta (=0). Modes can be defined as they currently are (in another argument).

A polar geometry could also be seen as a special case of cylindrical where z=0.
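Under this proposal, a reader could recognise both special cases from the array shape alone. The sketch below assumes the axis order (r, theta, z) and uses a hypothetical helper `special_cases`; it only illustrates the idea and does not describe existing reader behaviour.

```python
def special_cases(shape):
    """List the special cases matched by a cylindrical array.

    `shape` is (n_r, n_theta, n_z), an assumed axis order.
    """
    n_r, n_theta, n_z = shape
    cases = []
    if n_theta == 1:
        cases.append("thetaMode")  # a single point along theta
    if n_z == 1:
        cases.append("polar")      # a single point along z
    return cases


print(special_cases((64, 1, 128)))
print(special_cases((64, 32, 1)))
```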

See additional considerations in #189

Feb 26 '18 14:02 mccoys
