
Refactoring code for Enterprise issue #529

Open amontanez24 opened this issue 9 months ago • 2 comments

This PR alters the logic of _reverse_transform so that it can work naturally with the random regex transformer.

The previous logic was:

  1. If the generator has more values left than the number of samples requested, sample all of the requested values.
  2. If the generator has fewer values left than the number of samples requested, sample whatever remains in the generator, then make up the difference:
    • If enforce_uniqueness is enabled, add suffixes to old values to make more.
    • If enforce_uniqueness is disabled, copy previous values.
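
For reference, the previous flow can be sketched roughly as follows. This is an illustrative reconstruction, not the RDT source: the function name, the `remaining` argument, the `_<n>` suffix format, and the handling of the case where the two counts are equal are all assumptions.

```python
from itertools import islice


def sample_previous_logic(generator, remaining, sample_size, enforce_uniqueness):
    """Hypothetical sketch of the previous flow (not the actual RDT code).

    `remaining` is the number of values the generator is known to have left.
    """
    if remaining >= sample_size:
        # Enough values left: take exactly what was asked for.
        return list(islice(generator, sample_size))

    # Not enough left: drain whatever the generator still holds.
    values = list(islice(generator, remaining))
    if not values:
        raise ValueError('No values available to extend from.')

    missing = sample_size - len(values)
    base = list(values)
    if enforce_uniqueness:
        # Derive new values by suffixing old ones with a counter (assumed format).
        values += [f'{base[i % len(base)]}_{i}' for i in range(missing)]
    else:
        # Duplicates are allowed, so cycle through the old values.
        values += [base[i % len(base)] for i in range(missing)]
    return values
```

Note that the whole flow depends on knowing `remaining` up front, which is exactly what a random generator cannot provide.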

This doesn't work for the random regex generator for two reasons:

  1. It can't always sample all the values, because random draws produce collisions.
  2. You can't easily get all the "remaining" values, because the generator is technically unlimited.

The new logic, which should work for both cases, is:

  1. Sample as many values from the generator as you can, until you either have enough or hit an exception.
    • The exception means either the generator ran out (in the non-random case) or there were too many collisions (in the random case).
  2. If more values are required, add them as follows:
    • If enforce_uniqueness is enabled, add suffixes to old values to make more.
    • If enforce_uniqueness is disabled, copy previous values.
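
A minimal sketch of the new flow, under stated assumptions: `draw` is a hypothetical zero-argument callable that returns one value or raises, `TooManyCollisions` and `make_random_draw` are made-up helpers standing in for the random regex generator, and the `_<n>` suffix format is illustrative. None of these names come from RDT.

```python
import random


class TooManyCollisions(Exception):
    """Hypothetical error raised when random draws keep repeating."""


def make_random_draw(alphabet, length, max_retries=10, seed=0):
    """Build a draw() that returns unseen random strings (illustrative only)."""
    rng = random.Random(seed)
    seen = set()

    def draw():
        for _ in range(max_retries):
            candidate = ''.join(rng.choice(alphabet) for _ in range(length))
            if candidate not in seen:
                seen.add(candidate)
                return candidate
        raise TooManyCollisions('Gave up after repeated collisions.')

    return draw


def sample_new_logic(draw, sample_size, enforce_uniqueness):
    """Step 1: draw until enough or an exception. Step 2: fill the gap."""
    values = []
    try:
        while len(values) < sample_size:
            values.append(draw())
    except (StopIteration, TooManyCollisions):
        # Generator ran out (non-random) or collided too often (random).
        pass

    missing = sample_size - len(values)
    if missing and not values:
        raise ValueError('No values available to extend from.')

    base = list(values)
    if enforce_uniqueness:
        # Suffix old values with a counter; assumes no original value
        # already looks like '<value>_<n>'.
        values += [f'{base[i % len(base)]}_{i}' for i in range(missing)]
    else:
        values += [base[i % len(base)] for i in range(missing)]
    return values
```

The same function covers both cases: `sample_new_logic(iter(['x', 'y']).__next__, 3, False)` handles an exhausted sequential generator, while `sample_new_logic(make_random_draw('ab', 1), 5, True)` handles a random one whose value space is smaller than the request. Neither call needs to know how many values remain.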

CU-86b04th7c

amontanez24 • Apr 30 '24 02:04

I'm actually going to have to change this, because I found another bug in the enterprise version that would require more refactoring here.

amontanez24 • May 01 '24 15:05