
Aggregation not eliminating invalid groupings based on common ancestor

Open johngrimes opened this issue 2 years ago • 3 comments

This change relates to removing invalid combinations of grouping values when aggregations are executed over paths that share a common ancestor element.

This is shown by the following example. Given a set of Patient resources:

| id | name.given | name.family |
| --- | --- | --- |
| 1 | Benjamin | Franklin |
| 1 | Silence | Dogood |
| 2 | Isaac | Asimov |
| 2 | Paul | French |

And the following query:

{
  "resourceType": "Parameters",
  "parameter": [
    {
      "name": "aggregation",
      "valueString": "count()"
    },
    {
      "name": "grouping",
      "valueString": "name.given"
    },
    {
      "name": "grouping",
      "valueString": "name.family"
    }
  ]
}

We should get the following result:

| Given name | Family name | Number of patients |
| --- | --- | --- |
| Benjamin | Franklin | 1 |
| Silence | Dogood | 1 |
| Isaac | Asimov | 1 |
| Paul | French | 1 |

Not this:

| Given name | Family name | Number of patients |
| --- | --- | --- |
| Benjamin | Franklin | 1 |
| Benjamin | Dogood | 1 |
| Silence | Franklin | 1 |
| Silence | Dogood | 1 |
| Isaac | Asimov | 1 |
| Isaac | French | 1 |
| Paul | Asimov | 1 |
| Paul | French | 1 |

This is not technically a bug, but it is a refinement of the way that the Aggregate operation works, and it brings it in line with the Extract column joining logic.
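
To illustrate the difference between the two results, here is a minimal Python sketch. It is not Pathling code: the `patients` structure and both function names are hypothetical, and `given` is simplified to a single string per name. It only shows how treating the two grouping paths independently yields the 8-row result, while pairing values that come from the same `name` element (the common ancestor) yields the 4-row result.

```python
from collections import Counter
from itertools import product

# Sample data from the example above: each patient has a list of name elements.
# Simplification (assumption): one given name per name element.
patients = {
    "1": [{"given": "Benjamin", "family": "Franklin"},
          {"given": "Silence", "family": "Dogood"}],
    "2": [{"given": "Isaac", "family": "Asimov"},
          {"given": "Paul", "family": "French"}],
}


def group_independently(patients):
    """Cross every given name with every family name within a patient."""
    counts = Counter()
    for names in patients.values():
        givens = [n["given"] for n in names]
        families = [n["family"] for n in names]
        # Each distinct combination is counted once per patient.
        for pair in set(product(givens, families)):
            counts[pair] += 1
    return counts


def group_by_common_ancestor(patients):
    """Only pair values that belong to the same name element."""
    counts = Counter()
    for names in patients.values():
        for pair in {(n["given"], n["family"]) for n in names}:
            counts[pair] += 1
    return counts


independent = group_independently(patients)
joined = group_by_common_ancestor(patients)
print(len(independent), len(joined))
```

In this sketch, a combination such as ("Benjamin", "Dogood") appears only in the independent grouping, because no single `name` element contains both values.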

johngrimes, May 22 '22 10:05

Possible fixes for #387 may need to take the requirements of this issue into account.

johngrimes, May 22 '22 10:05

I've added a couple of failing tests to the new branch issue/542.

johngrimes, May 23 '22 01:05

@johngrimes I think what we discussed was that

{
  "resourceType": "Parameters",
  "parameter": [
    {
      "name": "aggregation",
      "valueString": "count()"
    },
    {
      "name": "grouping",
      "valueString": "name.given"
    },
    {
      "name": "grouping",
      "valueString": "name.family"
    }
  ]
}

would actually produce the second result (with 8 rows).

But:

{
  "resourceType": "Parameters",
  "parameter": [
    {
      "name": "aggregation",
      "valueString": "name.count()"
    },
    {
      "name": "grouping",
      "valueString": "name.given"
    },
    {
      "name": "grouping",
      "valueString": "name.family"
    }
  ]
}

would produce the first one (with 4 rows).

And something like this:

{
  "resourceType": "Parameters",
  "parameter": [
    {
      "name": "aggregation",
      "valueString": "name.where($this.given = 'Paul').count()"
    },
    {
      "name": "grouping",
      "valueString": "name.given"
    },
    {
      "name": "grouping",
      "valueString": "name.family"
    }
  ]
}

Should produce:

| Given name | Family name | Number of patients |
| --- | --- | --- |
| Paul | French | 1 |

Although maybe it should be:

| Given name | Family name | Number of patients |
| --- | --- | --- |
| Benjamin | Franklin | 0 |
| Silence | Dogood | 0 |
| Isaac | Asimov | 0 |
| Paul | French | 1 |

????
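
To make the two candidate outcomes for the filtered query concrete, here is a rough Python sketch. It does not reflect how Pathling evaluates the expression: the `names` rows, the `count_matching` function and its `keep_empty_groups` flag are all hypothetical. It assumes groupings are taken per `name` element, and the flag simply switches between dropping groups with no matching names and keeping them with a count of 0.

```python
from collections import Counter

# One row per name element from the example data: (patient id, given, family).
names = [
    ("1", "Benjamin", "Franklin"),
    ("1", "Silence", "Dogood"),
    ("2", "Isaac", "Asimov"),
    ("2", "Paul", "French"),
]


def count_matching(names, given, keep_empty_groups):
    """Count name elements whose given name matches, grouped by (given, family).

    keep_empty_groups=False drops groups with no match (first candidate result);
    keep_empty_groups=True keeps them with a count of 0 (second candidate result).
    """
    counts = Counter()
    for _patient_id, g, f in names:
        matched = 1 if g == given else 0
        if matched or keep_empty_groups:
            # Adding 0 still creates the group, so it shows up with a 0 count.
            counts[(g, f)] += matched
    return dict(counts)


only_matches = count_matching(names, "Paul", keep_empty_groups=False)
with_zeros = count_matching(names, "Paul", keep_empty_groups=True)
print(only_matches)
print(with_zeros)
```

Which of the two behaviours is the right one is exactly the open question here; the sketch only shows that they differ in whether non-matching groups survive the filter.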

piotrszul, May 25 '22 23:05