r.report: json output

Open cwhite911 opened this issue 1 year ago • 8 comments

I've designed four possible JSON schemas for r.report.

Option 1

{
    "location": "nc_spm_08_grass7",
    "created": "Fri Dec 6 17:00:21 2013",
    "region": {
        "north": 279073.97546639,
        "south": 113673.97546639,
        "east": 798143.31179672,
        "west": 595143.31179672,
        "sn-res": 200,
        "ew-res": 200
    },
    "mask": null,
    "totals": {
        "sqmi": 77.60,
        "acres": 49668.182
    },
    "maps": [
        {
            "title": "South-West Wake county",
            "description": "geology derived from vector map",
            "layer": "geology_30m",
            "type": "raster"
        }
    ],
    "categories": [
        {
            "category": 217,
            "description": "CZfg",
            "sqmi": 27.78,
            "acres": 17781.703,
            "categories": [
                {
                    "category": 1,
                    "description": "developed",
                    "sqmi": 18,
                    "acres": 17781.703
                }
            ]
        }
    ]
}

Option 2

{
    "location": "nc_spm_08_grass7",
    "created": "Fri Dec 6 17:00:21 2013",
    "region": {
        "north": 279073.97546639,
        "south": 113673.97546639,
        "east": 798143.31179672,
        "west": 595143.31179672,
        "sn-res": 200,
        "ew-res": 200
    },
    "mask": null,
    "maps": {
        "geology_30m_raster": {
            "title": "South-West Wake county",
            "description": "geology derived from vector map"
        }
    },
    "categories": {
        "217": {
            "description": "CZfg",
            "sqmi": 27.78,
            "acres": 11781.703,
            "categories": {
                "1": {
                    "description": "developed",
                    "sqmi": 18,
                    "acres": 17781.703
                }
            }
        },
        "total": {
            "sqmi": 77.60,
            "acres": 49668.182
        }
    }
}

Option 3

{
    "location": "nc_spm_08_grass7",
    "created": "Fri Dec 6 17:00:21 2013",
    "region": {
        "north": 279073.97546639,
        "south": 113673.97546639,
        "east": 798143.31179672,
        "west": 595143.31179672,
        "sn-res": 200,
        "ew-res": 200
    },
    "mask": null,
    "maps": [
        {
            "title": "South-West Wake county",
            "description": "geology derived from vector map",
            "layer": "geology_30m",
            "type": "raster"
        }
    ],
    "totals": {
        "sqmi": 77.60,
        "acres": 49668.182
    },
    "fields": [ "description", "sqmi", "acres" ],
    "categories": {
        "217": {
            "values": [ "CZfg", 27.78, 11781.703 ],
            "categories": {
                "1": {
                    "values": [ "developed", 18, 17781.703 ]
                }
            }
        }
    }
}

My personal preference is option 2 because it avoids the use of arrays.

cwhite911 avatar Jun 06 '23 12:06 cwhite911

Option 4

{
    "location": "nc_spm_08_grass7",
    "created": "Fri Dec 6 17:00:21 2013",
    "region": {
        "north": 279073.97546639,
        "south": 113673.97546639,
        "east": 798143.31179672,
        "west": 595143.31179672,
        "sn-res": 200,
        "ew-res": 200
    },
    "mask": null,
    "maps": {
        "geology_30m_raster": {
            "name": "South-West Wake county",
            "description": "geology derived from vector map"
        }
    },
    "fields": [ "label", "sqmi", "acres" ],
    "total": {
        "sqmi": 77.60,
        "acres": 49668.182
    },
    "category_order": ["217"],
    "categories": {
        "217": {
            "label": "CZfg",
            "sqmi": 27.78,
            "acres": 11781.703,
            "category_order": ["1"],
            "categories": {
                "1": {
                    "label": "developed",
                    "sqmi": 18,
                    "acres": 17781.703
                }
            }
        }
    }
}

cwhite911 avatar Jun 06 '23 16:06 cwhite911

If the goal of adding JSON output is to make it machine readable, then the date in the created field of these schemas should be in ISO 8601, as it is the closest thing to a consensus date format for JSON.

The date format in the sample schemas seems to be the same as the text output of r.report, but that doesn't mean that it is the best choice for the new output.
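
As a minimal sketch of that conversion, assuming the timestamp keeps the layout shown in the samples and that day and month names are in English (C locale), Python's standard `datetime` module can parse it and emit ISO 8601. The function name `to_iso8601` is hypothetical, and the result carries no time zone because the source string has none.

```python
from datetime import datetime


def to_iso8601(text_timestamp: str) -> str:
    """Convert a 'Fri Dec 6 17:00:21 2013'-style timestamp to ISO 8601."""
    # %a and %b are locale-dependent; this assumes English names.
    parsed = datetime.strptime(text_timestamp, "%a %b %d %H:%M:%S %Y")
    return parsed.isoformat()


iso_created = to_iso8601("Fri Dec 6 17:00:21 2013")
print(iso_created)  # 2013-12-06T17:00:21
```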

echoix avatar Jun 06 '23 22:06 echoix

A useful exercise to help define the format is to try creating a class in your favorite typed object-oriented language that models that information, using the simplest/easiest implementation you can think of that you could use in an app. It will shed light on some weird constructs. In the end, the goal is to serialize that information in another app, right?

Once finished, try to define a JSON Schema file: if it isn't easy to write a schema that correctly validates the JSON, maybe the structure has weird dependencies.
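
One way to try the exercise, sketched here in Python with dataclasses for the "categories" array of Option 1. The class name `Category` and the helper `category_from_dict` are hypothetical, chosen only for illustration; the sample values are copied from Option 1.

```python
import json
from dataclasses import dataclass, field
from typing import List


@dataclass
class Category:
    """One entry of the "categories" array in Option 1."""
    category: int
    description: str
    sqmi: float
    acres: float
    # Nested categories; empty at the leaves.
    categories: List["Category"] = field(default_factory=list)


def category_from_dict(data: dict) -> Category:
    """Recursively build a Category from a decoded JSON object."""
    return Category(
        category=data["category"],
        description=data["description"],
        sqmi=data["sqmi"],
        acres=data["acres"],
        categories=[category_from_dict(c) for c in data.get("categories", [])],
    )


sample = json.loads("""
[
    {
        "category": 217,
        "description": "CZfg",
        "sqmi": 27.78,
        "acres": 17781.703,
        "categories": [
            {"category": 1, "description": "developed", "sqmi": 18, "acres": 17781.703}
        ]
    }
]
""")
parsed = [category_from_dict(item) for item in sample]
print(parsed[0].categories[0].description)  # developed
```

With this layout every field maps to exactly one typed attribute, which is the property the exercise is meant to check.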

For example, in option 3 there is a pattern that would definitely be problematic to deserialize/validate: "fields" is a list of field names, and then "values" is an array containing the values for these fields (if I understand correctly), and the only relation between the two is their index inside the list.
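
To illustrate the coupling, here is a minimal Python sketch of what a consumer of Option 3 would have to do before it can read a value by name: pair every "values" array with the top-level "fields" list by position. The helper name `expand` is hypothetical; the data is copied from Option 3.

```python
import json

option3 = json.loads("""
{
    "fields": ["description", "sqmi", "acres"],
    "categories": {
        "217": {
            "values": ["CZfg", 27.78, 11781.703],
            "categories": {
                "1": {"values": ["developed", 18, 17781.703]}
            }
        }
    }
}
""")


def expand(fields, node):
    """Pair positional values with field names, recursing into children."""
    record = dict(zip(fields, node["values"]))
    record["categories"] = {
        key: expand(fields, child)
        for key, child in node.get("categories", {}).items()
    }
    return record


expanded = {
    key: expand(option3["fields"], node)
    for key, node in option3["categories"].items()
}
print(expanded["217"]["description"])  # CZfg
```

Note that `zip` silently truncates if the two lists differ in length, so a mismatch between "fields" and "values" would go unnoticed unless the consumer checks for it explicitly.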

It's normal for the JSON to seem verbose, but the goal is a machine-serializable format ;)

echoix avatar Jun 06 '23 23:06 echoix

I agree, and you are right: the date used in the example is an artifact of copying and pasting from r.report's output.

cwhite911 avatar Jun 14 '23 20:06 cwhite911

@cwhite911 I like Option 1 and Option 2; most of the web APIs I have worked with in the past were a form of these.

I dislike 3 because of the fields key and the use of arrays for values: it makes it necessary to consult multiple keys in the JSON to correlate the data. Also, arrays with mixed content (string and double in the example) are usually harder to work with in type-safe languages.

I dislike 4 because of the category_order field. I feel arrays naturally fit the use case of ordering, so if we need some data to be ordered we should use an array.

So if the categories/maps fields need to be ordered, I would suggest we go with option 1; otherwise, option 2.
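
A minimal Python sketch of the difference, using the Option 4 sample: to iterate in order, a consumer has to walk category_order and look each key up in categories, which is the same information an Option 1 style array carries directly. The helper name `to_ordered_list` is hypothetical.

```python
option4 = {
    "category_order": ["217"],
    "categories": {
        "217": {
            "label": "CZfg",
            "sqmi": 27.78,
            "acres": 11781.703,
            "category_order": ["1"],
            "categories": {
                "1": {"label": "developed", "sqmi": 18, "acres": 17781.703}
            },
        }
    },
}


def to_ordered_list(node):
    """Turn an Option 4 style node into an Option 1 style ordered array."""
    result = []
    for key in node.get("category_order", []):
        child = node["categories"][key]
        item = {"category": int(key)}
        # Copy the plain fields; the two structural keys are handled separately.
        item.update(
            {k: v for k, v in child.items()
             if k not in ("categories", "category_order")}
        )
        item["categories"] = to_ordered_list(child)
        result.append(item)
    return result


ordered = to_ordered_list(option4)
print([item["category"] for item in ordered])  # [217]
```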

kritibirda26 avatar Jun 18 '24 16:06 kritibirda26

@kritibirda26 I like option 1, but I would tweak the area responses, changing them from separate keys (acres, etc.) to an array of objects with the following schema.

{
    "unit": string,
    "value": double
}
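
As a rough sketch of how that would look for the totals in the samples above, written in Python: each unit-named key becomes one object in an array. The function name `areas_to_list` is hypothetical, and the key under which the array would be stored is not specified here.

```python
import json


def areas_to_list(areas: dict) -> list:
    """Convert {"sqmi": 77.6, ...} into [{"unit": "sqmi", "value": 77.6}, ...]."""
    return [{"unit": unit, "value": float(value)} for unit, value in areas.items()]


totals = areas_to_list({"sqmi": 77.60, "acres": 49668.182})
print(json.dumps(totals, indent=4))
```

A new unit then only adds an array element instead of a new key, so consumers do not need to know the set of unit names in advance.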

cwhite911 avatar Jun 21 '24 16:06 cwhite911

Ah, so it would allow representing even something completely defined in map units, for example for some custom unknown projection?

echoix avatar Jun 21 '24 16:06 echoix

@cwhite911 sounds good, should I open a new PR copying the relevant content from this PR or something?

kritibirda26 avatar Jun 21 '24 16:06 kritibirda26

The work continued in the now-merged #3935.

wenzeslaus avatar Aug 23 '24 17:08 wenzeslaus