NCoVUtils
Reference Administrative Names
Currently, many regional case count datasets are being returned from the package without clear reference to an existing geographic dataset. This means that users need to do some name matching before mapping case counts or joining them to other available datasets.
We are considering adding an iso_3166_2 field to all regional case counts to allow quick joins. This would improve the quality of the data provided to users, but it involves extra work: administrative names have to be matched manually, and the matching has to be fixed whenever the source datasets change.
The current proposal is to create a directory in the raw-data folder with lookup tables with two fields: name_as_recieved and iso_3166_2. A function can then be incorporated into existing functions that reads from this directory (hosted on GitHub) and joins ISO codes to administrative names. We can then write tests to check that names continue to match the lookup tables exactly.
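To make the proposal concrete, here is a minimal, language-agnostic sketch of the lookup-and-join idea, written in Python rather than in the package's own code. Everything in it is an assumption for illustration: the helper names (read_lookup, add_iso_codes, unmatched_names), the region column name, and the inline CSV standing in for a file in the proposed raw-data directory. The case counts are made-up placeholder values, not real data.

```python
import csv
import io

# Hypothetical lookup table contents. In the proposal this would be a CSV file
# in a directory under raw-data; the rows here are illustrative only.
LOOKUP_CSV = """name_as_recieved,iso_3166_2
Lombardia,IT-25
Lazio,IT-62
"""


def read_lookup(text):
    """Parse lookup CSV text into a dict: received name -> ISO 3166-2 code."""
    reader = csv.DictReader(io.StringIO(text))
    return {row["name_as_recieved"]: row["iso_3166_2"] for row in reader}


def add_iso_codes(cases, lookup):
    """Return copies of the case rows with an iso_3166_2 field (None if unmatched)."""
    return [{**row, "iso_3166_2": lookup.get(row["region"])} for row in cases]


def unmatched_names(cases, lookup):
    """Names in the case data that have no exact entry in the lookup table."""
    return sorted({row["region"] for row in cases} - set(lookup))


lookup = read_lookup(LOOKUP_CSV)

# Placeholder case counts (not real data). "Lombardy" simulates a source
# dataset changing how it spells a region name.
cases = [
    {"region": "Lombardia", "cases": 100},
    {"region": "Lazio", "cases": 20},
    {"region": "Lombardy", "cases": 5},
]

joined = add_iso_codes(cases, lookup)
missing = unmatched_names(cases, lookup)
```

A test along the lines suggested above could simply assert that the list of unmatched names is empty for each dataset, so a renamed region fails loudly instead of silently producing a missing code.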
I believe this would improve the usability of the data, but it would slightly increase the work needed to add each new function and would also lead to more tests breaking when datasets change.
It would be good to hear how people feel about this addition, especially as we add more case counts for low- and middle-income countries (LMIC).
@seabbs @kathsherratt @ffinger