Missing information: publisher codelists

Open ScatteredInk opened this issue 6 years ago • 2 comments

We have a reason field, but we might also want to allow publishers to use their own short codelists, as with the Companies House enumerations.

ScatteredInk, Aug 27 '19 14:08

On the related JIRA issue for the register's BODS export, you mentioned a missingInfoCode field. However, I was thinking this might be better as a nested object with a 'reason' and a 'description' (or better names), similar to how we structure unspecified relationships. It might help a little to standardise these two connected but separate sections of the standard.

stevenday, Aug 28 '19 09:08

Yes - we moved missing info reasons to a nested structure in 0.2:

https://github.com/openownership/data-standard/blob/3237fd3feee6e63c52b46a9acaf698ae75f41d54/schema/ownership-or-control-statement.json#L108-L139

There is a similar nested object when the exemption or missing data is at entity level.

So I think the question is whether we want a structure like:

- reason - a required field drawn from the BODS closed codelist
- originalReason - an optional open codelist drawn from the source system that maps to the closed BODS codelist
- description - an optional human-readable description, either of the codes in the open codelist or an inferred description of why the data is missing

originalReason gives analysts a quick way to search based on the original data. description is readable and self-documenting, but it is also subject to change (evidence: the Companies House repo) and harder to analyse. We might still want to keep it so that we can understand some kinks in the data later on.
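
To make the proposed structure concrete, here is a rough sketch of how a publisher might build the nested object. Everything in it is an assumption for illustration: the code values are placeholders rather than the real BODS or Companies House codelists, `build_missing_info` is a hypothetical helper, and falling back to "unknown" for unmapped source codes is just one possible choice.

```python
from typing import Optional

# Hypothetical closed codelist; placeholder values, not the real BODS codes.
CLOSED_REASONS = {"exempt", "not-provided", "unknown"}

# Hypothetical mapping from a source system's open codelist to the closed list.
SOURCE_TO_CLOSED = {
    "src-exemption-claimed": "exempt",
    "src-no-response": "not-provided",
}


def build_missing_info(original_reason: Optional[str] = None,
                       description: Optional[str] = None) -> dict:
    """Build the nested object: required `reason`, optional extras."""
    # Unmapped or absent source codes fall back to "unknown" (an assumption).
    reason = SOURCE_TO_CLOSED.get(original_reason, "unknown")
    assert reason in CLOSED_REASONS
    result = {"reason": reason}
    if original_reason is not None:
        result["originalReason"] = original_reason
    if description is not None:
        result["description"] = description
    return result


# An analyst can then filter on the original source code directly.
statements = [
    build_missing_info("src-no-response", "Owner did not reply to the request"),
    build_missing_info("src-exemption-claimed"),
    build_missing_info(),
]
no_response = [s for s in statements
               if s.get("originalReason") == "src-no-response"]
```

Keeping reason required and the other two fields optional means consumers can always rely on the closed codelist, while the source-specific code and free text remain available when a publisher has them.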

ScatteredInk, Aug 28 '19 10:08