conduit Representing structured data with equal-sized element and vertex arrays

One particular data format we want to represent in Blueprint stores the same count of vertices and elements along a dimension. (Mesh Blueprint expects n+1 vertices for n elements along each dimension.) This data format is a block-structured mesh with two layers of ghost zones. Here is an example of a mesh with one element field and one vertex field. The "real" data is 3 columns by 2 rows and the format stores two ghost zones at each end of each dimension, so the element data, vertex data, and vertex positions are all size 7x6.

Here is a notional set of "native" arrays. We can't change the dimensions and we don't want to copy the arrays.

element field:

 0 0 0 0 0 0 0
 0 0 1 2 1 1 0
 0 1 3 5 3 2 0
 0 0 1 3 1 0 0
 0 0 0 1 0 0 0
 0 0 0 0 0 0 0

vertex field:

 0 0  0  0  0 0 0
 0 0  0 -1  0 0 0
 0 0 -1 -2 -1 0 0
 0 0 -1 -3 -1 0 0
 0 0 -1 -2  0 0 0
 0 0  0  0  0 0 0

vertex x-positions:

-2 -1 0 1 2 3 4 
-2 -1 0 1 2 3 4 
-2 -1 0 1 2 3 4 
-2 -1 0 1 2 3 4 
-2 -1 0 1 2 3 4 
-2 -1 0 1 2 3 4

vertex y-positions:

-1 -1 -1 -1 -1 -1 -1
 0  0  0  0  0  0  0
 1  1  1  1  1  1  1
 2  2  2  2  2  2  2
 3  3  3  3  3  3  3
 4  4  4  4  4  4  4

If we ignore the ghost elements, we have enough data to represent the mesh in Blueprint. The basic idea is to add entries to the topology and each field specifying the location (in the array) of the start of data.

Here is a (proposed) Blueprint representation that does this:

coordsets:
    coords:
      type: "explicit"
      values:
        x: [-2,-1,0,1,2,3,4,
            -2,-1,0,1,2,3,4,
            -2,-1,0,1,2,3,4,
            -2,-1,0,1,2,3,4,
            -2,-1,0,1,2,3,4,
            -2,-1,0,1,2,3,4]
        y: [-1,-1,-1,-1,-1,-1,-1,
             0, 0, 0, 0, 0, 0, 0,
             1, 1, 1, 1, 1, 1, 1,
             2, 2, 2, 2, 2, 2, 2,
             3, 3, 3, 3, 3, 3, 3,
             4, 4, 4, 4, 4, 4, 4]
topologies:
    topo:
        coordset: "coords"
        type: "structured"
        elements:
            dims:
                i: 3
                j: 2
                offsets: [2,2]
                strides: [1,7]
fields:
    vert_vals:
        association: "vertex"
        topology: "topo"
        values: [0,0, 0, 0, 0,0,0,
                 0,0, 0,-1, 0,0,0,
                 0,0,-1,-2,-1,0,0,
                 0,0,-1,-3,-1,0,0,
                 0,0,-1,-2, 0,0,0,
                 0,0, 0, 0, 0,0,0]
        # note: shape is implied here from the topology as (4,3) 
        offsets: [2,2]
        strides: [1,7]
    ele_vals:
        association: "element"
        topology: "topo"
        values: [0,0,0,0,0,0,0,
                 0,0,1,2,1,1,0,
                 0,1,3,5,3,2,0,
                 0,0,1,3,1,0,0,
                 0,0,0,1,0,0,0,
                 0,0,0,0,0,0,0]
        # note: shape is implied here from the topology as (3,2)
        offsets: [2,2]
        strides: [1,7]

This specifies a coordset named "coords" that holds the (native, zero-copied) arrays for the vertex locations in X and Y. There is also a topology called "topo". It defines a structured mesh with 3 elements in the first dimension and 2 in the second, with vertex locations stored in "coords". Blueprint will read the data starting at (2,2), with a stride of 1 in the first dimension and 7 in the second. Crucially, Blueprint ignores everything in those arrays that falls outside the region selected by the offsets and strides. Even though the full arrays as stored do not satisfy Blueprint's requirements, we can skip the outer edge and work with the part that does.
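To make the offset/stride addressing concrete, here is a minimal NumPy sketch of how the "real" 3x2 element block could be viewed inside the 7x6 native array without copying. The helper name `strided_view` is hypothetical and is not part of Conduit. The sketch assumes that offsets count logical (i,j) positions and strides count array entries, which matches the example above but is not confirmed as the final Blueprint semantics.

```python
import numpy as np

# Flat "native" element array from the example: 6 rows of 7 values, ghosts included.
ele_vals = np.array([
    0, 0, 0, 0, 0, 0, 0,
    0, 0, 1, 2, 1, 1, 0,
    0, 1, 3, 5, 3, 2, 0,
    0, 0, 1, 3, 1, 0, 0,
    0, 0, 0, 1, 0, 0, 0,
    0, 0, 0, 0, 0, 0, 0,
])


def strided_view(flat, shape, offsets, strides):
    """Hypothetical helper: no-copy (nj, ni) view of the region picked out
    by offsets and strides (assumed semantics, see lead-in)."""
    ni, nj = shape
    # Flat position of logical entry (0, 0).
    start = offsets[0] * strides[0] + offsets[1] * strides[1]
    itemsize = flat.itemsize
    return np.lib.stride_tricks.as_strided(
        flat[start:],
        shape=(nj, ni),
        # NumPy strides are in bytes; the j stride comes first (row-major rows).
        strides=(strides[1] * itemsize, strides[0] * itemsize),
        writeable=False,
    )


real_elements = strided_view(ele_vals, shape=(3, 2), offsets=(2, 2), strides=(1, 7))
print(real_elements)
```

The view shares memory with `ele_vals`, so the ghost zones are skipped without duplicating any data.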

There are two fields, "vert_vals" over the vertices and "ele_vals" over the elements. Each field gains offsets and strides, which specify the section of the array to work with. We aren't using the offsets and strides from "topo" on the fields. This gives us flexibility and insensitivity to the in-memory layout of the array. But the fields do inherit the shape of the topology, the mesh extent in each dimension. This is because the fields and vertex locations all refer to the same underlying mesh: it would make no sense to be able to specify a different mesh shape for different fields. They all have to tie back to elements or vertices specified by the topology.
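The relationship between the topology's shape and each field's own offsets and strides can be sketched as follows. The function names `field_shape` and `flat_index` are made up for illustration, and the index formula is an assumption consistent with the example in this issue, not a statement of how Conduit implements it.

```python
def field_shape(element_dims, association):
    """Shape a field inherits from the topology (hypothetical helper)."""
    ni, nj = element_dims
    if association == "vertex":
        # A structured mesh has one more vertex than elements per dimension.
        return (ni + 1, nj + 1)
    return (ni, nj)


def flat_index(i, j, offsets, strides):
    """Position of logical entry (i, j) in the native array (assumed formula)."""
    return (offsets[0] + i) * strides[0] + (offsets[1] + j) * strides[1]


# Native vertex array from the example, ghosts included.
vert_vals = [
    0, 0,  0,  0,  0, 0, 0,
    0, 0,  0, -1,  0, 0, 0,
    0, 0, -1, -2, -1, 0, 0,
    0, 0, -1, -3, -1, 0, 0,
    0, 0, -1, -2,  0, 0, 0,
    0, 0,  0,  0,  0, 0, 0,
]

# The shape comes from "topo" (3x2 elements); offsets and strides come from the field.
shape = field_shape((3, 2), "vertex")
rows = [
    [vert_vals[flat_index(i, j, (2, 2), (1, 7))] for i in range(shape[0])]
    for j in range(shape[1])
]
print(shape, rows)
```

Only the offsets and strides vary per field; the (4,3) and (3,2) shapes both follow from the single set of element dims on the topology.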

Jan 24 '22 18:01 agcapps

Looks good!

One thing I want to make sure we capture. The expected shape of the vert-assod and ele-assod fields is determined by the topology (that's a relationship we are trying to preserve)

Idea: Even before we have the NDIterator worked out, for debugging we will want a utility function that converts this style of mesh to one w/o the offsets and strides (copying the subsets necessary). This will be really helpful for debugging and verifying.
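A rough sketch of what such a debugging utility might look like, using plain Python dicts in place of Conduit nodes. The names `compact_values` and `compact_field` are hypothetical, and the indexing convention is an assumption taken from the example in the issue description.

```python
def compact_values(values, shape, offsets, strides):
    """Copy the selected region into a dense list, i fastest (assumed layout)."""
    ni, nj = shape
    return [
        values[(offsets[0] + i) * strides[0] + (offsets[1] + j) * strides[1]]
        for j in range(nj)
        for i in range(ni)
    ]


def compact_field(field, element_dims):
    """Return a copy of a field dict with no offsets or strides."""
    ni, nj = element_dims
    if field["association"] == "vertex":
        shape = (ni + 1, nj + 1)
    else:
        shape = (ni, nj)
    # Carry over everything except the entries that describe the strided layout.
    out = {k: v for k, v in field.items()
           if k not in ("values", "offsets", "strides")}
    out["values"] = compact_values(
        field["values"], shape, field["offsets"], field["strides"])
    return out


ele_field = {
    "association": "element",
    "topology": "topo",
    "values": [0, 0, 0, 0, 0, 0, 0,
               0, 0, 1, 2, 1, 1, 0,
               0, 1, 3, 5, 3, 2, 0,
               0, 0, 1, 3, 1, 0, 0,
               0, 0, 0, 1, 0, 0, 0,
               0, 0, 0, 0, 0, 0, 0],
    "offsets": [2, 2],
    "strides": [1, 7],
}

compact = compact_field(ele_field, element_dims=(3, 2))
print(compact["values"])
```

Unlike a zero-copy view, this deliberately copies the subset, which is what makes the result easy to compare against a mesh built without ghost zones.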

Jan 24 '22 18:01 cyrush

You wrote, "One thing I want to make sure we capture. The expected shape of the vert-assod and ele-assod fields is determined by the topology (that's a relationship we are trying to preserve)"

What does this mean if we are specifying offsets and strides for each field?

Jan 24 '22 18:01 agcapps

offsets and strides still need shape to be interpreted.

(the topo's shape can imply default offsets and strides, however offsets and strides are not useful for indexing w/o shape)
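One way to read this point, sketched in Python with hypothetical helper names: the shape alone is enough to derive default offsets and strides for a dense array, while offsets and strides alone cannot say how far the indexed region extends.

```python
def default_offsets_and_strides(shape):
    """Defaults implied by a dense (ni, nj) array with i varying fastest."""
    ni, nj = shape
    return (0, 0), (1, ni)


def fits(shape, offsets, strides, array_len):
    """Check that the last logical entry lands inside the native array.
    This check cannot be written without knowing the shape."""
    ni, nj = shape
    last = (offsets[0] + ni - 1) * strides[0] + (offsets[1] + nj - 1) * strides[1]
    return last < array_len


defaults = default_offsets_and_strides((3, 2))
ok_full = fits((3, 2), (2, 2), (1, 7), array_len=42)
ok_short = fits((3, 2), (2, 2), (1, 7), array_len=20)
print(defaults, ok_full, ok_short)
```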

Jan 24 '22 18:01 cyrush

offsets and strides still need shape to be interpreted.

(the topo's shape can imply default offsets and strides, however offsets and strides are not useful for indexing w/o shape)

Edited the issue description to elaborate on why fields take their shape from the topology.

Jan 24 '22 19:01 agcapps

to keep things connected, these ideas are an evolution of the ideas in #755

Jan 25 '22 18:01 cyrush

@agcapps can we close this with our strided structured support?

Aug 22 '23 21:08 cyrush

We can close this.

Aug 22 '23 21:08 agcapps

thanks!

Aug 22 '23 21:08 cyrush
