
Add all instances of previously detected strings

Open omri374 opened this issue 2 years ago • 2 comments

Presidio relies on ML models, which might detect an entity in one sentence but miss the same entity in another. By automatically adding all instances of a previously identified entity, we can increase detection recall (at the possible cost of lower precision).

Example (hypothetical):

Text: "TrueForce is an American company. Lately, TrueForce was acquired by FalseForce".

If the first "TrueForce" was detected but the second wasn't, add another RecognizerResult with the span of the second "TrueForce".
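A minimal, library-free sketch of that idea follows. `Span` is a hypothetical stand-in for `RecognizerResult`, and `add_missed_instances` and `propagated_score` are illustrative names, not part of Presidio. The sketch assumes whole-token, case-sensitive matching and a fixed lower score for propagated spans; both choices are assumptions.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Span:
    """Hypothetical stand-in for a Presidio RecognizerResult."""
    entity_type: str
    start: int
    end: int
    score: float


def add_missed_instances(text, results, propagated_score=0.5):
    """Return the results plus a Span for every undetected repeat of a detected string."""
    expanded = list(results)
    for found in results:
        surface = text[found.start:found.end]
        if not surface:
            continue
        # Guards on both sides so a detected "Force" would not match inside "TrueForce".
        pattern = re.compile(r"(?<!\w)" + re.escape(surface) + r"(?!\w)")
        for match in pattern.finditer(text):
            # Skip occurrences already covered by an existing or newly added span.
            overlaps = any(
                match.start() < r.end and r.start < match.end() for r in expanded
            )
            if not overlaps:
                expanded.append(
                    Span(found.entity_type, match.start(), match.end(), propagated_score)
                )
    return sorted(expanded, key=lambda r: r.start)


text = "TrueForce is an American company. Lately, TrueForce was acquired by FalseForce."
detected = [Span("ORGANIZATION", 0, 9, 0.85)]  # only the first mention was found
all_spans = add_missed_instances(text, detected)
```

Giving propagated spans a lower score than model detections keeps them distinguishable downstream, which may help when tuning the recall/precision trade-off.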

omri374 · May 02 '22 05:05

How about automatically adding all detected entities to a deny list and running a second analysis of the text with it?
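A rough, library-free sketch of that two-pass flow is below. The first-pass results are assumed to be `(entity_type, start, end)` tuples, and `build_deny_lists` and `second_pass` are hypothetical helper names. In Presidio itself, the deny list would presumably be handed to a `PatternRecognizer` via its `deny_list` argument; that wiring is not shown here.

```python
import re
from collections import defaultdict


def build_deny_lists(text, first_pass):
    """Group the detected surface strings by entity type."""
    deny = defaultdict(set)
    for entity_type, start, end in first_pass:
        if end > start:  # ignore empty spans
            deny[entity_type].add(text[start:end])
    return deny


def second_pass(text, deny_lists):
    """Re-scan the text and return every whole-token match of a deny-list term."""
    hits = []
    for entity_type, terms in deny_lists.items():
        # Longest terms first so a longer term wins over its own prefix.
        ordered = sorted(terms, key=len, reverse=True)
        alternation = "|".join(re.escape(term) for term in ordered)
        pattern = re.compile(r"(?<!\w)(?:" + alternation + r")(?!\w)")
        hits.extend((entity_type, m.start(), m.end()) for m in pattern.finditer(text))
    return sorted(hits, key=lambda hit: hit[1])


text = "TrueForce is an American company. Lately, TrueForce was acquired by FalseForce."
first_pass = [("ORGANIZATION", 0, 9)]  # assumed output of the first analysis
hits = second_pass(text, build_deny_lists(text, first_pass))
```

The second pass returns the original detections as well as the new ones, so its output would still need to be merged and de-duplicated against the first-pass results.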

navalev · Oct 23 '22 19:10

Yep that's a good approach. We can maybe create a sample for it.

omri374 · Oct 23 '22 19:10