Distribution of `pydecimal` is very far from optimal

Open sshishov opened this issue 1 year ago • 1 comments

  • Faker version: 24.14.0 (same happens on the latest version)
  • OS: MacOS (does not matter)

The distribution of `pydecimal` is very far from optimal, which makes it difficult to use in tests. For instance, if the initial value is `max_value` and the updated value is also `max_value`, the test "breaks" because the value appears not to have been updated.

I can recommend the following approaches (imho):

  • re-draw the value if it equals the min or max value (maybe provide an extra kwarg to enable this)
  • make the generation logic more "random"; currently it is obvious that the value is set to the max value on overflow and to the min value on underflow
  • use the min and max values inside the calculation so that the generated value is guaranteed to stay within the boundaries
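
As a rough illustration of the third suggestion, the sketch below draws an integer between the scaled bounds and shifts it back by `right_digits` places, so no clamping is ever needed. The function name `bounded_decimal` and its parameters are hypothetical and are not part of Faker; this is one possible approach, not a proposed patch.

```python
import decimal as dec
import random


def bounded_decimal(min_value, max_value, right_digits, rng=random):
    """Draw uniformly from the grid of step 10**-right_digits inside [min_value, max_value].

    Hypothetical helper, not Faker's API. Pass bounds as Decimal, int, or str;
    floats would be converted with their binary rounding error.
    """
    scale = 10 ** right_digits
    # Scale the bounds to integers, rounding inward so every result stays in range.
    low = int((dec.Decimal(min_value) * scale).to_integral_value(rounding=dec.ROUND_CEILING))
    high = int((dec.Decimal(max_value) * scale).to_integral_value(rounding=dec.ROUND_FLOOR))
    if low > high:
        raise ValueError("no representable value between min_value and max_value")
    # Pick one grid point and shift the decimal point back.
    return dec.Decimal(rng.randint(low, high)).scaleb(-right_digits)
```

With `min_value=dec.Decimal('0.1')`, `max_value=1` and `right_digits=4`, every grid point from 0.1000 to 1.0000 is equally likely, so the bounds are no more frequent than any other value.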

Steps to reproduce

import collections
import decimal as dec

import faker

fake = faker.Faker()

# Count how often each value appears across one million draws.
counter = collections.Counter(
    fake.pydecimal(left_digits=0, right_digits=4, min_value=dec.Decimal('0.1'), max_value=1)
    for _ in range(1_000_000)
)
# Show the ten most frequent values.
for value, count in counter.most_common(10):
    print(value, ':', count)

Expected behavior

0.1437 : 76
0.3199 : 76
0.2477 : 75
0.7345 : 75
0.1284 : 74
0.6271 : 74
0.1597 : 74
0.4462 : 74
0.6293 : 74
0.4967 : 74

Actual behavior

1 : 500105
0.1 : 50284
0.1437 : 76
0.3199 : 76
0.2477 : 75
0.7345 : 75
0.1284 : 74
0.6271 : 74
0.1597 : 74
0.4462 : 74
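
The counts above put roughly 50% of the draws at `1` and roughly 5% at `0.1`. The sketch below is a simplified model, not Faker's actual code. It assumes the integer part and the fractional digits are drawn independently and the result is then clamped to the bounds, and the name `clamped_draw` is hypothetical. Under that assumption the same two spikes appear.

```python
import collections
import decimal as dec
import random


def clamped_draw(rng, min_value, max_value, right_digits=4):
    """Assumed model: draw integer part and fraction separately, then clamp to the bounds."""
    left = rng.randint(0, int(max_value))           # integer part: 0 or 1 when max_value is 1
    right = rng.randint(0, 10 ** right_digits - 1)  # fractional digits, 0000 to 9999
    value = dec.Decimal(left) + dec.Decimal(right).scaleb(-right_digits)
    # Anything outside the range collapses onto the nearest bound.
    return min(max(value, min_value), max_value)


rng = random.Random(0)
lo, hi = dec.Decimal("0.1"), dec.Decimal(1)
counts = collections.Counter(clamped_draw(rng, lo, hi) for _ in range(100_000))
for value, count in counts.most_common(3):
    print(value, ":", count)
```

In this model every draw with integer part 1 overflows and lands on the max value, and every draw below 0.1 underflows and lands on the min value, which is how clamping would produce the two outliers.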

sshishov, Aug 25 '24 09:08

This is how we are handling it for our tests:

def get_value() -> dec.Decimal:
    """Generate a fake decimal, discarding `min_value` and `max_value`, which are returned on underflow/overflow."""
    return next(
        item
        for item in iter(
            lambda: fake['en'].pydecimal(
                left_digits=0,
                right_digits=4,
                min_value=dec.Decimal('0.0001'),
                max_value=dec.Decimal(1),
            ),
            None,
        )
        if item not in {dec.Decimal('0.0001'), dec.Decimal(1)}
    )
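
The workaround above retries until a non-boundary value appears, which could loop forever if the generator only ever returned boundary values. A minimal sketch of the same idea with a cap on attempts follows. The helper `draw_excluding` and its `max_attempts` parameter are hypothetical names introduced here, and the demo uses a stand-in generator so that it runs without Faker.

```python
import decimal as dec


def draw_excluding(generate, excluded, max_attempts=100):
    """Call `generate` until it returns a value not in `excluded`, at most `max_attempts` times."""
    for _ in range(max_attempts):
        value = generate()
        if value not in excluded:
            return value
    raise RuntimeError(f"no acceptable value after {max_attempts} attempts")


# Stand-in generator for the demo: two boundary values, then a regular one.
samples = iter([dec.Decimal(1), dec.Decimal("0.0001"), dec.Decimal("0.4462")])
value = draw_excluding(lambda: next(samples), {dec.Decimal("0.0001"), dec.Decimal(1)})
print(value)  # prints 0.4462
```

In the tests, `generate` would be a lambda wrapping the `fake.pydecimal(...)` call shown above, with the same two boundary values passed as `excluded`.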

sshishov, Aug 25 '24 09:08

This issue is stale because it has been open for 30 days with no activity.

github-actions[bot], Nov 24 '24 02:11

This issue was closed because it has been inactive for 14 days since being marked as stale.

github-actions[bot], Dec 08 '24 02:12