faker datatype.number does not always generate numbers with expected precision

datatype.number does not always generate numbers with expected precision

Open pkuczynski opened this issue 2 years ago • 7 comments

Describe the bug

Due to the limitations of float arithmetics the precision param is not always well reflected in the generated output.

Reproduction

> faker.datatype.number({ precision: 0.01 })
64246.18
> faker.datatype.number({ precision: 0.001 })
75598.72
> faker.datatype.number({ precision: 0.0001 })
15719.824
> faker.datatype.number({ precision: 0.00001 })
57268.07910000001

Additional Info

Suggested solution would be to use a better math library offering arbitrary-precision decimal arithmetic, eg. https://www.npmjs.com/package/big.js

Jan 28 '22 09:01 pkuczynski

After I had some look into this, I noticed this only happens with precision of 0.0001. Then I came up with some simple code to just truncate the extra digits from the output. This way, we won't have to use Math.round() or some library and the results will work with seed too.

The code is in javascript, and probably not perfect and certainly a hacky approach, but it does cover all edge cases:

function getDecimals(a) {
  if (!isFinite(a)) return 0;
  let e = 1, p = 0;
  while (Math.round(a * e) / e !== a) { e *= 10; p++; }
  return p;
}

function toFixed(x) {
  if (Math.abs(x) < 1.0) {
    const e = parseInt(x.toString().split('e-')[1]);
    if (e) {
        x *= Math.pow(10,e-1);
        x = '0.' + (new Array(e)).join('0') + x.toString().substring(2);
    }
  } else {
    const e = parseInt(x.toString().split('+')[1]);
    if (e > 20) {
        e -= 20;
        x /= Math.pow(10,e);
        x += (new Array(e+1)).join('0');
    }
  }
  return x;
}

function toFixedTrunc(x, n) {
  x = toFixed(x) 
  const v = (typeof x === 'string' ? x : x.toString()).split('.');
  if (n <= 0) return v[0];
  let f = v[1] || '';
  if (f.length > n) return `${v[0]}.${f.substr(0,n)}`;
  while (f.length < n) f += '0';
  return `${v[0]}.${f}`;
}

const { faker } = require('@faker-js/faker');
const precision = 0.00001;
const decimals = getDecimals(precision);

function generate() {
  const generatedNumber = faker.datatype.number({ precision });
  return getDecimals(generatedNumber) <= decimals ? generatedNumber : toFixedTrunc(generatedNumber, decimals)
}

let counter = 0;
while(counter < 10) {
  console.log(generate())
  counter++;
}


/*
Output: 

32320.83572
37150.50227
74120.61501
97321.44276
86947.56555
98595.50214
24725.63345
40594.42884
921.41321
41778.05577
*/

repl here: https://replit.com/@luciferreeves/FrozenNiftyCategories#index.js

Jan 28 '22 18:01 luciferreeves

The issue is not only on 0.0001. Another example:

faker.datatype.number({ precision: 0.000000001 })
> 43583.241103992004

12 precision points, when 9 was requested. Big.js is not a large library and I think it does not make sense for us to battle with float arithmetics.

Jan 28 '22 19:01 pkuczynski

Even though the problem is with { precision: 0.000000001 } too, the code above should still work for all precision values. That's why I provided the repl, try changing the precision there. Here are the results with { precision: 0.000000001 }:

45543.57002466
54939.625723624
50.909680645
1796.946656181
43156.280539308
37206.544535025
77680.240670616
50875.845546187
61613.707753117
94161.122681139

Update:

I modified the generate method in the code above to take a precision:

function generate(precision) {
  const generatedNumber = faker.datatype.number({ precision });

  // This code from here goes into the faker.datatype.number implementation:
  const decimals = getDecimals(precision);
  return getDecimals(generatedNumber) <= decimals ? generatedNumber : toFixedTrunc(generatedNumber, decimals)
}

then, I added a simple test, generated 10000 numbers of the same precision and I did it for all precisions (upto 16 decimal places) and none of them fail:

// Here we test the precisions
let precision = 1;
let precisionArray = [];
let ctr = 0;
while(ctr !== 17) {
  precisionArray.push(precision);
  // this is equivalent of doing precision = precision/10 but with accurate results.
  precision = parseFloat(`0.${'0'.repeat(ctr)}1`);
  ctr += 1;
}

precisionArray.forEach(p => {
  process.stdout.write(`Precision: ${p.toExponential()} \t`)
  const fail = false;
  let counter = 1

  while(counter <= 10000) {
    const num = generate(precision);
    if (getDecimals(num) > getDecimals(precision)) {
      console.log(`Test ${counter} failed for precision ${precision}, generated number: ${num}.`)
      fail = true;
      break;
    }
    counter++;
  }

  if(!fail) {
    process.stdout.write(`PASS \n`)
  } else {
    process.stdout.write(`FAIL \n`)
  }
});

Output:

Precision: 1e+0     PASS 
Precision: 1e-1     PASS 
Precision: 1e-2     PASS 
Precision: 1e-3     PASS 
Precision: 1e-4     PASS 
Precision: 1e-5     PASS 
Precision: 1e-6     PASS 
Precision: 1e-7     PASS 
Precision: 1e-8     PASS 
Precision: 1e-9     PASS 
Precision: 1e-10    PASS 
Precision: 1e-11    PASS 
Precision: 1e-12    PASS 
Precision: 1e-13    PASS 
Precision: 1e-14    PASS 
Precision: 1e-15    PASS 
Precision: 1e-16    PASS

I also updated the repl, so you can try this out yourself. I think this method fixes the bug.

Jan 28 '22 20:01 luciferreeves

How many digits of precision should we support? And how thoroughly do we support these (e.g. precision such as 0.0003)

Apr 05 '22 16:04 ST-DDT

https://en.wikipedia.org/wiki/Double-precision_floating-point_format

IMO we should support a max precision of 15 digits, definitely 14 digits. Everything more precise I would highly suggest to use a different library that is more targeted to the use-case.

Apr 05 '22 20:04 Shinigami92

If we would limit ourselves to 12 digits, then a simple regex replace would do the job for us.

Apr 05 '22 21:04 ST-DDT

Any number would do for me as long as it works ;)

Apr 05 '22 21:04 pkuczynski

Heads up here, but I made this ticket a few weeks ago on the same matter. Are we expected to see this change in the next update?

Jan 19 '24 15:01 clocke3

Fixed by #2581 at least for precisions in the form of 10^-n.

#2581

This will be in the next release (v8.4).

Jan 19 '24 23:01 ST-DDT

faker faker copied to clipboard

datatype.number does not always generate numbers with expected precision

Describe the bug

Reproduction

Additional Info

Update:

faker
faker copied to clipboard