Develop schema, publication patterns and documentation around natural person identifiers

Open ScatteredInk opened this issue 7 years ago • 8 comments

Publishers need to locate and disambiguate identifiers for natural persons and legal entities.

  • Legal entity identifier lists can be found on org-id.guide. These include single-jurisdiction canonical lists, fall-back lists, and universal (but incomplete) lists like GLEIF.
  • There is no such common good data infrastructure for natural person identifiers. (Although information on individual types of identifiers may be available, e.g., the OECD's tax identification number page.)

For natural persons, our guidance is directing publishers to:

  • Publish natural identifiers from a limited subset of identity documents (PASSPORT, TAXID or IDCARD), prefixed with a jurisdiction code.
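
As a rough illustration of the pattern described above, the sketch below builds a scheme string from a jurisdiction code and a document type. The helper name `person_scheme`, the sample identifier value, and the validation rule (three uppercase letters, following the three-character ISO codes that BODS recommends, as discussed later in this thread) are assumptions for illustration, not part of the schema.

```python
import re

# Document types named in the guidance above; the set is illustrative only.
PERSON_ID_TYPES = {"PASSPORT", "TAXID", "IDCARD"}


def person_scheme(jurisdiction: str, id_type: str) -> str:
    """Return a scheme string such as 'GBR-PASSPORT' (hypothetical helper)."""
    code = jurisdiction.strip().upper()
    kind = id_type.strip().upper()
    # Assumption: the jurisdiction is a three-letter ISO 3166-1 alpha-3 code.
    if not re.fullmatch(r"[A-Z]{3}", code):
        raise ValueError(f"expected a three-letter country code, got {jurisdiction!r}")
    if kind not in PERSON_ID_TYPES:
        raise ValueError(f"unsupported identifier type: {id_type!r}")
    return f"{code}-{kind}"


# Made-up identifier value, shown only to illustrate the shape of the object.
identifier = {"scheme": person_scheme("gbr", "passport"), "id": "123456789"}
```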

As publishers begin to adopt, we can:

  • consider whether this breakdown is sufficient, bearing in mind the use-case of BODS for the internal exchange of information, as well as publication.
  • look at the documentation structure and consider if we need to move to a model closer to the one used for entity identifiers.
  • gauge the demand for a common good list, with useful metadata, of natural person identifiers.

Nov 15 '18 09:11 ScatteredInk

From https://github.com/open-contracting/standard/issues/801: Another TYPE is needed for residence card numbers. I suggested RESCARD.

Also noting that the BODS schema recommends 3-letter ISO country codes (alpha-3) for the jurisdiction of a personal identifier scheme, whilst org-id.guide uses 2-letter codes. Do we know whether that was intentional (to avoid clashes)?

Jan 21 '19 09:01 duncandewhurst

Related to #22

Mar 23 '20 17:03 timgdavies

@ajparsons has been doing work on personal identifiers as part of a Global Digital Marketplace project.

There is a write up here, but in short, it raises a couple of issues for BODS:

(1) Strengthening guidance on privacy considerations for personal identifiers

It may be appropriate to collect certain personal identifiers during BOT disclosure but not to publish them, because they are considered private or sensitive. Guidance should be clearer on this (e.g. at reference.html#schema-identifier).

(2) Providing identity hints

In some cases, whilst publishing a sensitive identifier would have negative impacts on legitimate privacy, there may be approaches to allow publication of a string derived from a sensitive identifier, such that data consumers have a better chance of matching and disambiguating individuals.

Alex's paper has considered a number of options for deriving privacy-preserving identifiers, including a central API to convert sensitive identifiers into privacy-preserving identifiers. In this case, our existing identifier.scheme and identifier.id components can be used to represent a derived identifier, on the assumption that the API would provide a unique mapping between an individual and an ID.
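
One way such a conversion service could produce a stable, non-reversible mapping is a keyed hash. The sketch below is an assumption for illustration and is not the approach described in the paper: the secret key, the `derive_identifier` function, the sample input, and the `XXX-DERIVED` scheme name are all hypothetical.

```python
import hashlib
import hmac


def derive_identifier(sensitive_id: str, secret_key: bytes) -> str:
    """Map a sensitive identifier to a stable opaque token (hypothetical)."""
    # Normalise so that spacing and case differences do not change the output.
    normalised = "".join(sensitive_id.split()).upper()
    return hmac.new(secret_key, normalised.encode("utf-8"), hashlib.sha256).hexdigest()


key = b"held-only-by-the-central-service"  # placeholder key for the example
record = {
    "scheme": "XXX-DERIVED",  # hypothetical scheme name
    "id": derive_identifier("AB 12 34 56 C", key),
}
```

Because the same input and key always give the same output, a value derived this way could sit in identifier.id in the usual manner, provided the underlying identifier is itself unique to one person.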

The other option considered, an 'ID fragment', is not suitable for inclusion via identifier.scheme and identifier.id, because two different individuals could share the same fragment. The fragment nonetheless remains useful for both matching and disambiguating individuals, given that:

  • Two records with the same name, but different identifier fragments are unlikely to be the same person (providing the underlying ID used to generate the fragment is a high-quality ID)
  • Two records with names that 'sound like' each other (by some reasonable matching algorithm) and that share the same identifier fragment, are highly likely to be the same person.

In these cases, we should consider whether to have an identifier.hint field, or identifier.fragment field that can optionally be provided where a full identifier.id cannot be given for privacy reasons.
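
The two rules above can be sketched as a small comparison function. This is a minimal illustration under stated assumptions: the record shape (`name` and `fragment` keys), the function names, and the use of `difflib.SequenceMatcher` with a 0.85 threshold as a crude stand-in for a 'sounds like' algorithm are all hypothetical choices.

```python
from difflib import SequenceMatcher


def names_similar(a: str, b: str, threshold: float = 0.85) -> bool:
    """Crude stand-in for a 'sounds like' name matcher."""
    return SequenceMatcher(None, a.casefold(), b.casefold()).ratio() >= threshold


def compare(rec_a: dict, rec_b: dict) -> str:
    """Classify two records using the name plus the identifier fragment."""
    frag_a, frag_b = rec_a.get("fragment"), rec_b.get("fragment")
    if frag_a is None or frag_b is None:
        return "unknown"  # no fragment to compare
    if not names_similar(rec_a["name"], rec_b["name"]):
        # A shared fragment alone proves nothing, since fragments can collide.
        return "no evidence"
    if frag_a == frag_b:
        return "likely same person"
    return "likely different people"
```

For example, "Jane Smith" and "Jane Smyth" sharing a fragment would be classed as likely the same person, while two "Jane Smith" records with different fragments would be classed as likely different people.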

May 11 '20 13:05 timgdavies

Two things from recent publisher support:

  1. We should add NATIONALID to personal ID types.

  2. I've just advised that in BODS 0.2 the best way to deal with an ID fragment is:

"identifiers": [{
  "id": "271274-*****",
  "scheme": "XXX-IDCARD"
}]

I think that's what we have to advise for now. Do we need to add an identifier.fragment property when an asterisk in the value is just as meaningful?
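
If the asterisk convention is used, a data consumer has to detect it before treating `id` as a full identifier. A minimal sketch, assuming `*` never appears in a genuine identifier value; `split_masked_id` is a hypothetical helper, not part of BODS:

```python
def split_masked_id(value: str) -> tuple[str, bool]:
    """Return (visible characters, is_fragment) for an identifier value."""
    is_fragment = "*" in value
    # Drop the mask characters and any separator left dangling at either end.
    visible = value.replace("*", "").strip("-")
    return visible, is_fragment
```

With the example above, `split_masked_id("271274-*****")` yields the visible part `271274` and flags the value as a fragment, whereas a value with no asterisks is returned unchanged and unflagged.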

Sep 11 '20 16:09 kd-ods

I've put this issue back into the 1.0 RC milestone, but I think we need to separate out what's a requirement vs what can come later. For me, so far, that's the more specific details in the comments here, not the initial issue description:

  • Add an identifier.fragment property
  • Add RESCARD and NATIONALID examples to the guidance about personal ids
  • Reinstate guidance we used to have about what is and isn't suitable to publish (vs collect)

I don't think we've yet seen sufficient requirement for shared infrastructure à la org-id.guide, which I took as the main thrust of this ticket when originally taking it out of 1.0.

Each of these could be individual tickets, but I'll leave this open as a catch-all for the time being as I think a few more similar things might crop up from our current pilots.

Sep 17 '20 13:09 stevenday

@stevenday

Reinstate guidance we used to have about what is and isn't suitable to publish (vs collect)

I think that guidance still exists, doesn't it? Here: http://standard.openownership.org/en/0.2.0/schema/guidance/identifiers.html#shared-identifiers

Unless there's an edit I'm not spotting.

Sep 17 '20 15:09 kd-ods

Ah, sorry - I didn’t see it flagged at the top of the schema section for identifiers and assumed it had gone because Tim mentioned a need above. Great!

Sep 17 '20 16:09 stevenday

Do we need to add an identifier.fragment property when an asterisk in the value is just as meaningful?

One potential issue there is that a fragment may be a hashed version of a partial ID, rather than just asterisked out (e.g. where even partial IDs have information value). There are no current examples of this however.
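
To make the distinction concrete, the sketch below contrasts an asterisked fragment with a hashed one. Both helpers are hypothetical, and the choice of SHA-256, the six-character prefix, and the truncation to eight hex characters are assumptions for illustration only.

```python
import hashlib


def masked_fragment(full_id: str, keep: int = 6) -> str:
    """Keep the first `keep` characters and asterisk out the rest."""
    return full_id[:keep] + "*" * max(len(full_id) - keep, 0)


def hashed_fragment(full_id: str, keep: int = 6, digest_chars: int = 8) -> str:
    """Hash the partial ID so that even the partial value is not shown."""
    partial = full_id[:keep]
    # Note: an unsalted hash of a short partial ID can be reversed by
    # enumeration, so this shows the format only, not a privacy guarantee.
    return hashlib.sha256(partial.encode("utf-8")).hexdigest()[:digest_chars]
```

A hashed fragment contains no asterisk, so a consumer inspecting the value alone could not tell it apart from a full identifier, which is where a dedicated property would help.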

Sep 18 '20 08:09 ajparsons