RDT
Refactoring code for Enterprise issue #529
This PR alters the logic of `_reverse_transform` so that it can work naturally with the random regex transformer.
The previous logic was:
- If the number of values the generator has left is more than the number of samples asked for, sample all of the requested values.
- If the number of values the generator has left is less than the number of samples asked for, sample the rest of the generator, then:
  - If `enforce_uniqueness` is enabled, add suffixes to old values to make more.
  - If `enforce_uniqueness` is disabled, copy previous values.
This doesn't work for the random regex generator for two reasons:
- It can't always sample all the values because it has collisions
- You can't easily get all "remaining" values because technically it's unlimited.
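Both problems can be seen in a small sketch. The `random_values` function below is a hypothetical stand-in for a random regex generator, not RDT's actual implementation; it assumes the generator draws from a fixed value space and never stops.

```python
import itertools
import random


def random_values(space, seed=0):
    """Hypothetical stand-in for a random regex generator: yields forever."""
    rng = random.Random(seed)
    while True:
        yield rng.choice(space)


space = ['id_a', 'id_b', 'id_c']
draws = list(itertools.islice(random_values(space), 10))

# Problem 1: repeated draws collide, so 10 draws contain at most 3 distinct values.
distinct = set(draws)

# Problem 2: the generator never raises StopIteration, so there is no finite
# "rest of the generator" that could be consumed in one go.
```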
The new logic that should work for both cases is:
- Sample as many values from the generator as you can until you either get enough or hit an exception.
  - The exception means the generator either ran out (in the non-random case) or we had too many collisions (in the random case).
- If more values are required, add them by:
  - If `enforce_uniqueness` is enabled, add suffixes to old values to make more.
  - If `enforce_uniqueness` is disabled, copy previous values.
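A minimal sketch of this flow, not the actual RDT code. The function name `sample_values`, the exception types, and the `_<n>` suffix format are all assumptions made for illustration; the real transformer may signal exhaustion or collisions differently.

```python
def sample_values(generator, num_samples, enforce_uniqueness):
    """Sketch of the new sampling logic (hypothetical helper, not RDT's API)."""
    values = []
    try:
        # Step 1: draw until we have enough or the generator gives up.
        while len(values) < num_samples:
            values.append(next(generator))
    except (StopIteration, RuntimeError):
        # Assumption: StopIteration means the generator ran out (non-random case),
        # RuntimeError means too many collisions (random case).
        pass

    if len(values) == num_samples:
        return values
    if not values:
        raise ValueError('The generator produced no values to reuse.')

    # Step 2: fill the shortfall from the values already sampled.
    base_values = list(values)
    seen = set(values)
    suffix = 0
    while len(values) < num_samples:
        for base in base_values:
            if len(values) == num_samples:
                break
            # Unique mode appends a suffix; otherwise the old value is copied.
            new_value = f'{base}_{suffix}' if enforce_uniqueness else base
            if enforce_uniqueness and new_value in seen:
                continue  # skip a suffixed value that already exists
            seen.add(new_value)
            values.append(new_value)
        suffix += 1
    return values


unique = sample_values(iter(['a', 'b']), 5, enforce_uniqueness=True)
copied = sample_values(iter(['a', 'b']), 5, enforce_uniqueness=False)
```

Because the loop only relies on an exception to stop, the same code path serves both a finite generator and an unlimited random one.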
CU-86b04th7c
Task linked: CU-86b04th7c SDV-Enterprise - In `RegexGenerator`, provide an option to generate keys in a random manner #529
I'm actually going to have to change this because I found another bug in the enterprise version that would require more refactoring here