
stable output for the encrypt operator

Open dgcaron opened this issue 2 years ago • 2 comments

Is your feature request related to a problem? Please describe. For analysis purposes on fields that are anonymized using the encrypt operator, it can be helpful to have a stable output from the operator. Right now, each run of the anonymizer produces a different result for the same input.

Describe the solution you'd like It would be great if a user could opt out of the random IV in the cipher method.

Describe alternatives you've considered I tried to use hash, but the original value can be recovered from a hash without needing a secret, for example by hashing candidate values and comparing.
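To illustrate the concern with an unkeyed hash: when the set of possible values is small (phone numbers, names), anyone can hash candidates and match them against the anonymized output. This is a minimal sketch, assuming SHA-256 and a made-up candidate list; the helper names are hypothetical and not part of Presidio.

```python
import hashlib
from typing import List, Optional


def sha256_hex(value: str) -> str:
    """Unkeyed hash: anyone can recompute it for any candidate value."""
    return hashlib.sha256(value.encode("utf-8")).hexdigest()


def reverse_by_dictionary(digest: str, candidates: List[str]) -> Optional[str]:
    """Hypothetical helper: return the candidate whose hash matches, if any."""
    for candidate in candidates:
        if sha256_hex(candidate) == digest:
            return candidate
    return None


# Illustrative values only.
anonymized = sha256_hex("555-0101")
candidates = ["555-0100", "555-0101", "555-0102"]
recovered = reverse_by_dictionary(anonymized, candidates)
```

No secret is involved at any point, which is why a keyed construction (encryption or an HMAC) is the more suitable tool here.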

dgcaron avatar Feb 16 '23 14:02 dgcaron

This was also discussed in https://github.com/microsoft/presidio/discussions/1006. I want to understand the use case: is AES ECB valid as well, given that it does not require an IV?

SharonHart avatar Feb 19 '23 08:02 SharonHart

Yes, that is exactly the use case I am referring to. I am no cipher expert, but AES ECB appears to have some security weaknesses (see the link below). I guess that is the reason the IV is randomized right now.

https://nvd.nist.gov/vuln/detail/CVE-2022-28382#:~:text=Due%20to%20the%20use%20of,by%20observing%20repeating%20byte%20patterns.

Maybe giving users a choice of cipher suite and its options would be best?

So the API would look something like this:

{ "encrypt": { "key": "", "suite": "", "options": {}}}

The other option you gave in #1006 sounds good as well. Users can then base the IV on a hash or something similar to maintain a stable output for the same value, without discarding the randomised IV.
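One way to read "base the IV on a hash" is to derive the IV from a keyed hash of the value, so the same input always yields the same IV. The sketch below only shows the derivation step, using the standard library. `derive_iv` and `iv_key` are hypothetical names, not part of Presidio, and the 16-byte length assumes an AES mode whose IV is one block long.

```python
import hashlib
import hmac

AES_BLOCK_SIZE = 16  # bytes; AES has a 128-bit block, so CBC-style IVs are 16 bytes


def derive_iv(iv_key: bytes, plaintext: str) -> bytes:
    """Hypothetical helper: stable IV from a truncated HMAC-SHA256 of the value."""
    mac = hmac.new(iv_key, plaintext.encode("utf-8"), hashlib.sha256)
    return mac.digest()[:AES_BLOCK_SIZE]


# Illustrative key and values only; a real key would come from a secret store.
iv_key = b"illustrative-iv-derivation-key"
iv_a = derive_iv(iv_key, "John Smith")
iv_b = derive_iv(iv_key, "John Smith")
iv_c = derive_iv(iv_key, "Jane Doe")
```

Here `iv_a` and `iv_b` are identical, so encrypting the same value with the same key would give the same ciphertext. That is the stable output requested, with the trade-off that equal inputs become visibly equal in the anonymized data. Using HMAC with a secret key rather than a plain hash keeps the IV from being computable by someone who only guesses the value.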

dgcaron avatar Feb 19 '23 09:02 dgcaron