
Security Docs

Open hardbyte opened this issue 6 years ago • 0 comments

We need to add a note on security...

The Cryptographic Longterm Key (CLK) is computed and compared following the method described by Rainer Schnell, Tobias Bachteler, and Jörg Reiher in "A Novel Error-Tolerant Anonymous Linking Code". We deviate from their approach by using a KDF (key derivation function) to ensure that the hash functions are independent. For a detailed discussion, see "Who Is 1011011111…1110110010? Automated Cryptanalysis of Bloom Filter Encryptions of Databases with Several Personal Identifiers", in which Kroll and Steinmetzer present a cryptanalysis of, and an attack on, Bloom filters built from multiple identifiers.
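To illustrate the KDF idea, here is a minimal sketch that derives one independent HMAC key per feature from a single shared secret, using HKDF (RFC 5869) built on the standard library. It is not clkhash's actual implementation; the function names, the `feature-N` info labels, and the key size are assumptions made for the example.

```python
import hashlib
import hmac


def hkdf_sha256(secret: bytes, salt: bytes, info: bytes, length: int) -> bytes:
    """HKDF (RFC 5869) with SHA-256: extract a PRK, then expand to `length` bytes."""
    # Extract step: an empty salt is replaced by a block of zero bytes.
    prk = hmac.new(salt or b"\x00" * 32, secret, hashlib.sha256).digest()
    okm, block, counter = b"", b"", 1
    # Expand step: chain HMAC blocks until enough output is available.
    while len(okm) < length:
        block = hmac.new(prk, block + info + bytes([counter]), hashlib.sha256).digest()
        okm += block
        counter += 1
    return okm[:length]


def derive_feature_keys(secret: bytes, num_features: int, key_size: int = 32):
    """Derive one key per feature (hypothetical helper, labels are illustrative)."""
    return [
        hkdf_sha256(secret, salt=b"", info=b"feature-%d" % i, length=key_size)
        for i in range(num_features)
    ]


keys = derive_feature_keys(b"shared secret", num_features=3)
```

Because each key comes from a different `info` label, the keyed hashes for different features are unrelated to one another, even though the parties only have to agree on one secret.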

Semantic Security

The semantic security of the CLKs depends on two factors:

  1. Multiple features are hashed to create the CLK. If only one or two features (e.g. name) are used, population statistics on the distribution of names can be used to identify records based solely on the CLKs. See the paper "Cryptanalysis of Basic Bloom Filters Used for Privacy Preserving Record Linkage" for an in-depth analysis.

  2. The HMAC secret shared by the data-providing organizations is unique (not reused between mappings) and is kept secret from the entity carrying out the linkage operation.
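The two factors above can be seen in a toy sketch: several features are hashed into one Bloom filter with a keyed HMAC, and the result depends entirely on the secret. This is a simplified illustration, not clkhash's encoding; `make_clk`, `dice`, the filter size, the number of hash positions `k`, and the choice of SHA-256/SHA-512 for double hashing are all assumptions.

```python
import hashlib
import hmac


def bigrams(value: str):
    """Split a whitespace-padded value into overlapping bi-grams."""
    padded = f" {value} "
    return [padded[i:i + 2] for i in range(len(padded) - 1)]


def make_clk(features, secret: bytes, size: int = 1024, k: int = 20) -> set:
    """Return the set of bit positions set in a toy CLK (illustrative only)."""
    bits = set()
    for index, value in enumerate(features):
        for gram in bigrams(value):
            token = f"{index}:{gram}".encode()
            # Double hashing: two keyed digests generate k positions per bi-gram.
            h1 = int.from_bytes(hmac.new(secret, token, hashlib.sha256).digest(), "big")
            h2 = int.from_bytes(hmac.new(secret, token, hashlib.sha512).digest(), "big")
            for i in range(k):
                bits.add((h1 + i * h2) % size)
    return bits


def dice(a: set, b: set) -> float:
    """Dice coefficient of two sets of bit positions."""
    return 2 * len(a & b) / (len(a) + len(b))


record = ["john", "smith", "1980-01-01"]
typo = ["jon", "smith", "1980-01-01"]

same_secret = dice(make_clk(record, b"secret-A"), make_clk(typo, b"secret-A"))
other_secret = dice(make_clk(record, b"secret-A"), make_clk(record, b"secret-B"))
```

With the same secret, the misspelled record still scores a high similarity; with a different secret, even the identical record only overlaps by chance. A party that does not hold the secret cannot rebuild CLKs for candidate names and compare them, which is why the secret must stay hidden from the linkage entity.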

Possible Weaknesses

When bi-grams are created, the first and last bi-gram of each word are padded with a whitespace character. This is a weakness because it makes it easier for an attacker to find the beginning and end of a word. We need to investigate whether dropping the padding decreases matching accuracy.
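A small sketch of the tokenisation with and without padding makes the boundary markers visible. The `bigrams` helper and its `pad` flag are hypothetical and only serve this example.

```python
def bigrams(word: str, pad: bool = True):
    """Split a word into overlapping bi-grams, optionally whitespace-padded."""
    text = f" {word} " if pad else word
    return [text[i:i + 2] for i in range(len(text) - 1)]


padded = bigrams("anna")               # [' a', 'an', 'nn', 'na', 'a ']
unpadded = bigrams("anna", pad=False)  # ['an', 'nn', 'na']

# Bi-grams containing a space reveal where the word starts and ends.
boundary_markers = [g for g in padded if " " in g]  # [' a', 'a ']
```

The padded variant yields two extra tokens that encode the first and last letter of the word. Dropping them removes that signal but also changes the token sets being compared, so the effect on matching accuracy would have to be measured.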

Aha! Link: https://csiro.aha.io/features/ANONLINK-49

hardbyte, Feb 14 '18 05:02