
Option to build model using word hashes for model anonymity

Open rudikershaw opened this issue 3 years ago • 1 comment

If your model contains user input, sensitive user information may make it into the model. If the model is exposed (for example, in the browser), that information is exposed with it.

When creating a new WhichX object, we should allow a configuration option to hash all words added to the model. This also requires hashing words at classification time, so that hashed input can be matched against the hashed entries in the model.

  • Specify new configuration option to allow model hashing.
  • Check this configuration before adding to the model.
  • Check this configuration before classifying against the model.
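The three steps above could be sketched roughly as follows. This is only an illustration: `SketchModel`, `ModelOptions`, `hashWords`, `hashWord`, and `toKeys` are hypothetical names, not part of WhichX, and the scoring is a toy word-overlap count rather than the library's classifier. The point it shows is that the option is checked in one shared place, so words are transformed identically when added and when classified.

```typescript
// Hypothetical sketch: none of these names come from WhichX itself.
type ModelOptions = { hashWords?: boolean }

// Placeholder non-cryptographic 32-bit string hash (an assumption, swap as needed).
const hashWord = (word: string): string => {
  let h = 0
  for (let i = 0; i < word.length; i++) h = (Math.imul(31, h) + word.charCodeAt(i)) | 0
  return h.toString(36)
}

class SketchModel {
  // label -> (stored key -> occurrence count)
  readonly counts = new Map<string, Map<string, number>>()

  constructor(private readonly options: ModelOptions = {}) {}

  // The only place the option is checked; used by both addData and classify.
  private toKeys(text: string): string[] {
    const words = text.toLowerCase().split(/\W+/).filter(w => w.length > 0)
    return this.options.hashWords ? words.map(hashWord) : words
  }

  addData(label: string, text: string): void {
    const bucket = this.counts.get(label) ?? new Map<string, number>()
    for (const key of this.toKeys(text)) bucket.set(key, (bucket.get(key) ?? 0) + 1)
    this.counts.set(label, bucket)
  }

  // Toy scoring: returns the label whose stored keys overlap most with the input.
  classify(text: string): string | undefined {
    const keys = this.toKeys(text)
    let best: string | undefined
    let bestScore = -1
    this.counts.forEach((bucket, label) => {
      const score = keys.reduce((sum, k) => sum + (bucket.get(k) ?? 0), 0)
      if (score > bestScore) {
        best = label
        bestScore = score
      }
    })
    return best
  }
}

// Usage: with hashWords enabled, the stored keys are hashes, not the raw words.
const model = new SketchModel({ hashWords: true })
model.addData("cat", "meow purr whiskers")
model.addData("dog", "bark fetch")
console.log(model.classify("a loud bark"))         // "dog"
console.log(model.counts.get("cat")?.has("meow"))  // false
```

Because both paths go through the same helper, a model built with hashing enabled has to be queried with hashing enabled as well; mixing the two settings would simply find no matches.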

rudikershaw avatar Sep 26 '21 10:09 rudikershaw

You can use this to hash words:

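// Non-cryptographic 32-bit string hash (same recurrence as Java's String.hashCode).
export const generateHash = (value: string): string => {
  let h = 0
  // Math.imul and `| 0` keep h within the signed 32-bit integer range.
  for (let i = 0; i < value.length; i++) h = (Math.imul(31, h) + value.charCodeAt(i)) | 0
  // Base 36 keeps the key short; negative values get a leading "-".
  return h.toString(36)
}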
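
A quick way to see how this function behaves is to run it on a few inputs. The snippet below repeats the function so that it runs on its own; `hashText` is a hypothetical helper added only for illustration. It shows that the hash is deterministic, and that distinct words can collide because the output is only 32 bits wide.

```typescript
// Copied from the comment above so this snippet is self-contained.
const generateHash = (value: string): string => {
  let h = 0
  for (let i = 0; i < value.length; i++) h = (Math.imul(31, h) + value.charCodeAt(i)) | 0
  return h.toString(36)
}

// Hypothetical helper: lower-case the text, split on non-word characters, hash each word.
const hashText = (text: string): string[] =>
  text.toLowerCase().split(/\W+/).filter(w => w.length > 0).map(generateHash)

console.log(generateHash("a"))                          // 2p
console.log(generateHash("Aa") === generateHash("BB"))  // true, both are "1mo"
console.log(hashText("Hello, hello"))                   // two identical entries
```

Collisions like "Aa" and "BB" mean two different words can share one model entry, which may slightly affect classification. The hash is also not cryptographic, so common words could likely be recovered by hashing a dictionary and comparing; whether that is acceptable depends on how sensitive the input is.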

Sharcoux avatar May 03 '22 15:05 Sharcoux