presidio
stable output for the encrypt operator
Is your feature request related to a problem? Please describe. For analysis purposes on fields that are anonymized using the encrypt operator, it can be helpful to have a stable output from the operator. Right now, every run of the anonymizer produces a different result for the same input.
Describe the solution you'd like. It would be great if a user could opt out of the random IV in the cipher method.
Describe alternatives you've considered. I tried to use the hash operator, but a hash can be mapped back to the original value without the use of a secret, for example by hashing candidate values and comparing the results.
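To illustrate the concern with hashing, here is a minimal sketch of how an unkeyed hash over a small input space can be reversed by guessing. It uses plain SHA-256 from the Python standard library and is not Presidio code; the function names and the phone-number format are made up for the example.

```python
import hashlib
from typing import Iterable, Optional


def sha256_hex(value: str) -> str:
    """Unkeyed SHA-256 digest of a string, as a hex string."""
    return hashlib.sha256(value.encode("utf-8")).hexdigest()


def reverse_by_guessing(target_digest: str, candidates: Iterable[str]) -> Optional[str]:
    """Return the candidate whose digest matches, or None if nothing matches.

    No secret is needed: anyone who can enumerate plausible values
    can run this against an anonymized field.
    """
    for candidate in candidates:
        if sha256_hex(candidate) == target_digest:
            return candidate
    return None


# Hypothetical anonymized value: a short phone extension.
anonymized = sha256_hex("555-0142")

# Only 10,000 possible values, so the search is trivial.
candidates = (f"555-{n:04d}" for n in range(10000))
recovered = reverse_by_guessing(anonymized, candidates)
```

With a keyed construction (encryption, or an HMAC), the same guessing attack requires the secret, which is why a stable output from the encrypt operator would be preferable for this use case.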
Also discussed here: https://github.com/microsoft/presidio/discussions/1006
I want to understand the use case. Is AES ECB valid as well, since it does not require an IV?
Yes, that is exactly the use case I am referring to. I am no cipher expert, but AES ECB appears to have some security weaknesses: identical plaintext blocks produce identical ciphertext blocks, so repeating patterns remain visible. I guess that is the reason the IV is randomized right now.
https://nvd.nist.gov/vuln/detail/CVE-2022-28382#:~:text=Due%20to%20the%20use%20of,by%20observing%20repeating%20byte%20patterns.
Maybe giving users a choice of cipher suite and its options would be best?
So the API would look something like this:
{ "encrypt": { "key": "", "suite": "", "options": {}}}
The other option you gave in #1006 sounds good as well. Users can then base the IV on a hash or something similar to maintain a stable output for the same value without discarding the randomised IV.
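As a rough sketch of what "base the IV on a hash" could look like, the snippet below derives a 16-byte IV from the plaintext with HMAC-SHA256. This is only an illustration using the Python standard library, not Presidio's API: `derive_iv`, `iv_key`, and the idea of passing the result to the encrypt operator are all assumptions. The IV key is assumed to be a secret kept separate from the encryption key.

```python
import hashlib
import hmac

AES_BLOCK_SIZE = 16  # AES IVs for CBC mode are one block (16 bytes) long


def derive_iv(iv_key: bytes, plaintext: str) -> bytes:
    """Derive a deterministic IV from the plaintext (hypothetical helper).

    The same key and plaintext always give the same IV, so the
    ciphertext would be stable across runs. HMAC is used rather than
    a bare hash so the IV cannot be computed without the secret key.
    """
    mac = hmac.new(iv_key, plaintext.encode("utf-8"), hashlib.sha256)
    return mac.digest()[:AES_BLOCK_SIZE]


# Placeholder secret for the example only.
iv_key = b"hypothetical-iv-derivation-key"

iv_a = derive_iv(iv_key, "John Smith")
iv_b = derive_iv(iv_key, "John Smith")  # same value, same IV
iv_c = derive_iv(iv_key, "Jane Doe")    # different value, different IV
```

The trade-off is the one we are asking for: equal values encrypt to equal ciphertexts, so anyone who sees the output can tell which fields match, but different values still get different IVs, unlike ECB.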