Combining Schemas or External Field Type templates?

Open khusmann opened this issue 11 months ago • 0 comments

I'm creating this thread in response to @pschumm's comment here, so as not to pollute the original topic.

We are in the process of publishing a set of standardized table schemas on the HEAL Data Platform, each of which represents a specific, validated commonly-used measure (typically consisting of several items).

This isn't a fully-formed thought, but it strikes me that the issue of partial schemas is also related to the issue of permitting multiple (partial) table schemas per resource.

I haven't seen this elsewhere either, and I'm interested in similar functionality, so I wanted to brainstorm ways we could do this.

Borrowing @peterdesmet's partialSchema prop, what about something like this?

{
  "partialSchema": [
    {
      "fields": [
        {
          "name": "participant_id",
          "type": "integer"
        }
      ]
    },
    "measure1.json",
    "measure2.json"
  ]
}

Here, if the partialSchema field is given an array, the resulting schema would simply be the union of all of the schemas passed to it.
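A minimal sketch of that union in Python, to make the idea concrete. The function names, the base_dir parameter, and the rule that a duplicate field name is an error are all assumptions for illustration; the proposal itself does not say how conflicts between partial schemas should be handled.

```python
import json
from pathlib import Path


def load_schema(ref, base_dir="."):
    """Return a schema dict from an inline object or a JSON file path.

    Assumption: string entries such as "measure1.json" are paths
    relative to base_dir.
    """
    if isinstance(ref, dict):
        return ref
    return json.loads((Path(base_dir) / ref).read_text(encoding="utf-8"))


def union_partial_schemas(partial_schema, base_dir="."):
    """Combine the entries of a partialSchema array into one schema.

    Assumption: two entries defining the same field name is an error.
    """
    fields, seen = [], set()
    for ref in partial_schema:
        for field in load_schema(ref, base_dir)["fields"]:
            if field["name"] in seen:
                raise ValueError(f"duplicate field name: {field['name']}")
            seen.add(field["name"])
            fields.append(field)
    return {"fields": fields}
```

Fields keep the order in which their partial schemas are listed, so participant_id would come first in the example above.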

The problem with this approach is that the publisher is then stuck with the field names provided by the included schemas. There's no way to reuse field definitions without also reusing the names of those fields.

So here's an alternative I want to propose: external field types. It's similar to the above, but adds one more layer of indirection and the ability to "scope" the names of the included schemas:

{
  "schema": {
    "fields": [
      {
        "name": "participant_id",
        "type": "integer"
      },
      {
        "name": "question1",
        "type": "external",
        "typeRef": "measure1::item1"
      },
      {
        "name": "question2",
        "description": "This description will replace whatever the original description was in measure1::item2",
        "type": "external",
        "typeRef": "measure1::item2"
      },
      {
        "name": "question3",
        "type": "external",
        "typeRef": "measure2::item1"
      },
      {
        "name": "question4",
        "description": "This item shows how measure1::item1 can be used twice in the same resource",
        "type": "external",
        "typeRef": "measure1::item1"
      }      
    ],
    "externalFieldTypes": {
      "measure1": "measure1.json",
      "measure2": {
        "fields": [
          {
            "name": "item1",
            "description": "Description for item1 of this measure",
            "type": "integer",
            "constraints": {
              "minimum": 0
            }
          }
        ]
      }
    }
  }
}

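Here, the new externalFieldTypes property in the schema maps names to referenced schemas (either a path or an inline schema). Their field definitions can then be imported into the main schema by fields of type external, whose typeRef takes the form name::field. Any other property set on the importing field, such as description, replaces the corresponding property of the referenced definition.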

This way, measure designers can publish definitions of their validated measures, and publishers can use and link to those definitions while still choosing their own field names (and even using the same field definition for multiple fields within a data resource).
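The resolution step this proposal implies can be sketched in Python as follows. This is only an illustration under stated assumptions: resolve_external_fields and its load parameter are hypothetical names, typeRef is split on "::" as in the example above, and every property set on the importing field other than type and typeRef is assumed to override the referenced definition.

```python
import copy


def resolve_external_fields(schema, load=lambda ref: ref):
    """Expand fields of type "external" using the externalFieldTypes map.

    `load` turns a map value into a schema dict. The default only handles
    inline schemas; resolving a path like "measure1.json" is left to the
    caller (assumption).
    """
    sources = {
        alias: {f["name"]: f for f in load(ref)["fields"]}
        for alias, ref in schema.get("externalFieldTypes", {}).items()
    }
    resolved = []
    for field in schema["fields"]:
        if field.get("type") != "external":
            resolved.append(copy.deepcopy(field))
            continue
        alias, _, item = field["typeRef"].partition("::")
        base = copy.deepcopy(sources[alias][item])  # KeyError if unknown
        # Local properties (name, description, ...) win over referenced ones.
        overrides = {k: v for k, v in field.items() if k not in ("type", "typeRef")}
        resolved.append({**base, **overrides})
    return {"fields": resolved}


schema = {
    "fields": [
        {"name": "participant_id", "type": "integer"},
        {"name": "question3", "type": "external", "typeRef": "measure2::item1"},
        {
            "name": "question5",
            "description": "Reuses measure2::item1 under another name",
            "type": "external",
            "typeRef": "measure2::item1",
        },
    ],
    "externalFieldTypes": {
        "measure2": {
            "fields": [
                {
                    "name": "item1",
                    "description": "Description for item1 of this measure",
                    "type": "integer",
                }
            ]
        }
    },
}

resolved = resolve_external_fields(schema)
```

After resolution, question3 and question5 are both plain integer fields: question3 inherits the description from measure2's item1, while question5 keeps its own.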

khusmann, Mar 17 '24 21:03