
Document assumptions about the input error characteristics

Open hardbyte opened this issue 5 years ago • 0 comments

As each PII field may have different types of errors (e.g. missing data, transcription errors, entirely changed data), we need to document any built-in assumptions about the input error characteristics. Cases to cover:

  • Matching common names: "Wu", "Smith", etc.
  • Switching fields: "Dexter Cody" vs "Cody Dexter"
  • Edits which preserve letter frequency but not bigrams: "Gerg" vs "Greg"
  • Address changes: "123 blah lane" vs "84 another street"
  • Format issues: "14/8/2018" vs "8/14/2018"
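
To make a few of these cases concrete, here is a minimal sketch in plain Python. It is not the anonlink or clkhash implementation: it assumes values are tokenized into space-padded character bigrams and compared with a Dice coefficient on the resulting sets. The helper names `bigrams` and `dice` are hypothetical and exist only for this illustration.

```python
def bigrams(value: str) -> set[str]:
    """Return the set of character bigrams of a lowercased, space-padded value.

    Assumption for illustration only: one leading and one trailing space
    are added so that the first and last characters also form bigrams.
    """
    padded = f" {value.lower()} "
    return {padded[i:i + 2] for i in range(len(padded) - 1)}


def dice(a: str, b: str) -> float:
    """Dice coefficient between the bigram sets of two strings."""
    x, y = bigrams(a), bigrams(b)
    return 2 * len(x & y) / (len(x) + len(y))


if __name__ == "__main__":
    # Pairs taken from the list above.
    pairs = [
        ("Gerg", "Greg"),                # same letters, different bigrams
        ("Dexter Cody", "Cody Dexter"),  # swapped fields, joined into one string
        ("14/8/2018", "8/14/2018"),      # day/month order
        ("123 blah lane", "84 another street"),  # entirely changed data
    ]
    for left, right in pairs:
        print(f"{left!r} vs {right!r}: {dice(left, right):.2f}")
```

Under these assumptions the transposition in "Gerg" costs most of the shared bigrams even though the letters are identical, while swapping two names that end up in one padded string leaves the bigram set unchanged. Whether the real pipeline behaves this way depends on how fields are tokenized and hashed, which is exactly what needs documenting.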

I've put the issue here, but it might make more sense in clkhash.

cc: @wilko77 @nbgl

Aha! Link: https://csiro.aha.io/features/ANONLINK-13

hardbyte, Aug 13 '18 23:08