
CLK hash: hash PII for entity matching

CLK Hash

Python implementation of cryptographic long-term key (CLK) hashing. clkhash supports Python 3.6 and later.

The technique is described by Rainer Schnell, Tobias Bachteler, and Jörg Reiher in "A Novel Error-Tolerant Anonymous Linking Code".
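To give a feel for the idea behind a CLK, here is a minimal, standard-library-only sketch. It is not the clkhash implementation: the function names (``bigrams``, ``toy_clk``, ``dice``), the filter size, the number of hash functions, and the choice of HMAC-SHA1 and HMAC-SHA256 for double hashing are all assumptions made for illustration. Each field is split into character bigrams, every bigram is hashed with a secret key into several positions of one shared Bloom filter, and two filters are compared with the Dice coefficient.

```python
import hashlib
import hmac
from typing import Iterable, List


def bigrams(value: str) -> List[str]:
    """Split a field into overlapping two-character tokens, padded with spaces."""
    padded = " " + value.lower() + " "
    return [padded[i:i + 2] for i in range(len(padded) - 1)]


def toy_clk(fields: Iterable[str], secret: bytes,
            num_bits: int = 1024, num_hashes: int = 20) -> int:
    """Encode all fields into one Bloom filter, returned as an int bitmask.

    Illustrative only: parameters and hash choices are assumptions,
    not the defaults used by clkhash.
    """
    bits = 0
    for value in fields:
        for gram in bigrams(value):
            token = gram.encode("utf-8")
            # Two keyed hashes, combined by double hashing into num_hashes positions.
            h1 = int.from_bytes(hmac.new(secret, token, hashlib.sha1).digest(), "big")
            h2 = int.from_bytes(hmac.new(secret, token, hashlib.sha256).digest(), "big")
            for i in range(num_hashes):
                bits |= 1 << ((h1 + i * h2) % num_bits)
    return bits


def dice(a: int, b: int) -> float:
    """Dice coefficient of two bitmasks: 2 * |a AND b| / (|a| + |b|)."""
    total = bin(a).count("1") + bin(b).count("1")
    return 2 * bin(a & b).count("1") / total if total else 0.0


# A small typo in the name should still give a high similarity score.
original = toy_clk(["john smith", "1980/01/01"], b"secret")
with_typo = toy_clk(["jon smith", "1980/01/01"], b"secret")
print(dice(original, with_typo))
```

Because the hash positions depend on the secret key, two parties can only compare encodings if they used the same key. For real data, use the clkhash API rather than this sketch.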


Installation

Install clkhash with all dependencies using pip:

pip install clkhash

Documentation

https://clkhash.readthedocs.io

clkhash API

To hash a CSV file of entities using the default schema:

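from clkhash import clk, randomnames

fake_pii_schema = randomnames.NameList.SCHEMA
# Use a context manager so the CSV file is closed once hashing is done.
with open('fake-pii-out.csv', 'r') as csv_file:
    clks = clk.generate_clk_from_csv(csv_file, 'secret', fake_pii_schema)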

Citing

Clkhash and the wider Anonlink project are designed, developed and supported by `CSIRO's Data61 <https://www.data61.csiro.au/>`__. If you use any part of this library in your research, please cite it using the following BibTeX entry::

@misc{Anonlink,
  author = {CSIRO's Data61},
  title = {Anonlink Private Record Linkage System},
  year = {2017},
  publisher = {GitHub},
  journal = {GitHub Repository},
  howpublished = {\url{https://github.com/data61/clkhash}},
}