It seem to be inconsistent to me that all groupBy endpoints except the ratio/groupBy/boundary are structured like this:

{
groupByResult:[ 
                   result:[ ... ],
                   groupByObject
]
}

only the ratio/groupBy/boundary uses 'groupByBoundaryResult' instead of 'groupByResult' like this:

{
groupByBoundaryResult:[ 
                   ratioResult:[ ... ],
                   groupByObject
]
}

The "normal" /groupBy/boundary does not use the groupByBoundaryResult but the common groupByResult as all the others.

To me there seems no need for different naming of the groupByResult.

Jul 15 '20 14:07 mcauer

Behind each of these result objects is normally a Java class that has specific attributes, which are reflected by the nested objects here one level deeper in the JSON. So as you can see the response of the /ratio endpoint includes a "ratioResult" JSON array, whereas the other ones include a general "result" JSON array. I think the different naming of the outter object here is a legacy issue from back when we removed the /share endpoint and all its corresponding classes. Back then I combined the remaining /groupBy endpoints to use that one groupByResult, but apparently either forgot to adapt them for the /ratio endpoint or had a reason in the code to distinguish them somehow. It's true though that from a users perspective, this is ambiguous and should not be like that. Will investigate 🔍

Jul 15 '20 14:07 FabiKo117

From a user perspective I would suggest for the sake of simple response parsing to use the same JSON paths for all groupBy respones such that you always have:

- response
  - groupByResult
    - groupByObject
    - result
      - {timestamp, value} OR {timestamp, value, value2, ratio} OR {fromTimestamp, toTimestamp, value}

Jul 15 '20 15:07 mcauer

I made an inspection on what response structure is used where and summarized it as follows:

Current (V1) aggregation response structures

1. `DefaultAggregationResponse` is used by:

/elements/<area|count|length|perimeter>
/elements/<area|count|length|perimeter>/density
/users/count
/users/count/density

- response
    - result[]
      - {timestamp, value} OR {fromTimestamp, toTimestamp, value}

2. `RatioResponse` is used by:

/elements/count/ratio

- response
    - ratioResult[]
      - {timestamp, value, value2, ratio} # while value is the denominator and value2 is the nominator of the ratio (value2/value = ratio)

3. `GroupByResponse` is used by:

/elements/<area|count|length|perimeter>/groupBy/<boundary|key|tag|type>
/elements/<area|count|length|perimeter>/density/groupBy/<boundary|tag|type>
/users/count/groupBy/<boundary|key|tag|type>
/users/count/density/groupBy/<boundary|tag|type>

- response
  - groupByResult[]
    - groupByObject: string
    - result[]
      - {timestamp, value} OR {fromTimestamp, toTimestamp, value}

4. Nested `GroupByResponse`:

/elements/count/groupBy/boundary/groupBy/tag
/elements/count/density/groupBy/boundary/groupBy/tag

Note: the difference to normal groupBy is just the groupByObject to be an array of strings instead of a single string

- response
  - groupByResult[]
    - groupByObject: string[]
    - result[]
      - {timestamp, value}

5. `RatioGroupByBoundaryResponse` is used by:

/elements/count/ratio/groupBy/boundary

- response
  - groupByBoundaryResult[]
    - groupByObject: string
    - ratioResult[]
      - {timestamp, value, value2, ratio} # while value is the denominator and value2 is the nominator of the ratio (value2/value = ratio)

Jul 16 '20 07:07 mcauer

I think for a future v2 version we should consider to simplify this into only 2 structures as proposed here:

Proposed reduction to only two response structures

Only keep two main structures which contain the 3 different result types (snapshot-value, interval-value, snapshot-ratio)

1. Plain results without grouping

- response
    - result[]
      - {timestamp, value} OR {fromTimestamp, toTimestamp, value} OR {timestamp, value, value2, ratio}

2. Grouped results with one, two and potential N groupings

corresponds to the current structure for nested groupBy, which could also be used for the "normal" groupBy) for more than 2 groupings one would just have to add further strings to the groupByObject-array for the current normat single grouping, there would be just one entry in the array

- response
  - groupByResult[]
    - groupByObject: string[]
    - result[]
      - {timestamp, value} OR {fromTimestamp, toTimestamp, value} OR {timestamp, value, value2, ratio}

Jul 16 '20 07:07 mcauer

I just took a look in the code and tweaked it a bit. The RatioResult can indeed be directly included in the standard Result object for plain /ratio requests. For /ratio/groupBy/boundary requests though, it's not so easily changeable because we use the timestamp in the response for the GeoJSON response format.

See e.g. such a request The timestamp is used here as identification together with the spatial objects. I don't say that it's not possible though. Can sit down again next week on some day and continue with it. And this could also be something for a 1.1 release, don't think that we'd have to wait for v2 to include this update in the running instance.

Jul 17 '20 15:07 FabiKo117

And this could also be something for a 1.1 release, don't think that we'd have to wait for v2 to include this update in the running instance.

:thinking: IMHO, this should be only changed in a major release, even though this is indeed a very small change and wouldn't change the logic behind the resource: It does change the data interface however and for an end-user it would be very annoying if a working query "suddenly" produces a result which cannot be processed with the existing scripts any more. I experienced a similar issue recently in another project, where a single escape character got changed in the data from an external data provider causing hard to debug follow-up errors down the line (https://github.com/hotosm/osm-analytics-cruncher/issues/20#issuecomment-662340775).

Jul 23 '20 10:07 tyrasd

For /ratio/groupBy/boundary requests though, it's not so easily changeable because we use the timestamp in the response for the GeoJSON response format. […] The timestamp is used here as identification together with the spatial objects.

I don't see why this would be much different from the non-ratio request with groupBy/boundary (example). Isn't that one also using the timestamp to distinguish the different result features from each other? Yet, it can still use the same groupByResult field in the format=json output. Or am I overlooking something?

Jul 23 '20 10:07 tyrasd

Why are there different object names in the response of /groupBy/boundary and /ratio/groupByBoundary?

Current (V1) aggregation response structures

1. DefaultAggregationResponse is used by:

2. RatioResponse is used by:

3. GroupByResponse is used by:

4. Nested GroupByResponse:

5. RatioGroupByBoundaryResponse is used by:

Proposed reduction to only two response structures

1. Plain results without grouping

2. Grouped results with one, two and potential N groupings

1. `DefaultAggregationResponse` is used by:

2. `RatioResponse` is used by:

3. `GroupByResponse` is used by:

4. Nested `GroupByResponse`:

5. `RatioGroupByBoundaryResponse` is used by: