node-faststats

Can buckets start with numbers < 0?

Open sjmcdowall opened this issue 9 years ago • 5 comments

My data actually starts with negative numbers, and I use my own bucket list. However, the docs indicate the buckets are the UPPER limits, and the first is 0. That doesn't work for me: my lower and upper limits are ALL < 0 (-0.754556 to -0.7527556, to be exact), so the first bucket in the array I pass in is something like -0.7456 (rounded). Is that going to cause an error? Am I doing something fundamentally wrong here, or shouldn't buckets assume a lowest value of 0?

sjmcdowall avatar Sep 10 '15 16:09 sjmcdowall

The buckets will work fine if you specify your own buckets with negative thresholds. The minimum is actually -Infinity.
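To make the "upper limits, starting from -Infinity" idea concrete, here is a minimal standalone sketch. It is not the library's own bucketing code: the function name `countIntoBuckets`, the inclusive upper bound, and the extra overflow slot are all assumptions made for illustration.

```javascript
// Standalone illustration only; NOT node-faststats' implementation.
// `upperLimits` is a sorted array of bucket upper limits. The first bucket
// has no lower bound, so it effectively starts at -Infinity.
// Assumption: a value equal to a limit falls into that limit's bucket.
function countIntoBuckets(values, upperLimits) {
  // One extra slot at the end for values above every limit.
  const counts = new Array(upperLimits.length + 1).fill(0);
  for (const v of values) {
    let i = upperLimits.findIndex((limit) => v <= limit);
    if (i === -1) i = upperLimits.length; // above the last limit
    counts[i] += 1;
  }
  return counts;
}

// All-negative limits work the same way as positive ones.
const limits = [-0.7545, -0.7540, -0.7535, -0.7530];
const counts = countIntoBuckets([-0.7546, -0.7542, -0.7541, -0.7528], limits);
console.log(counts); // [ 1, 2, 0, 0, 1 ]
```

Nothing in this logic depends on the limits being positive, which is why negative thresholds are not a problem for bucketing by itself.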

The actual problem you'll face is with the calculations for geometric mean and geometric standard deviation, since those use Math.log, which is only defined for positive real numbers.

There are a couple of ways to handle this.

  1. Change your data to always be positive (i.e., just flip the sign when passing it into this library and flip it again when pulling it out).
  2. Modify the code to add an option to completely disable geometric calculations (since it doesn't make sense to use some values but not all in these calculations).
  3. If you know your numbers are always going to be negative, modify the geometric code to do the sign flipping (and of course save that state so you can reflip when returning a value).

Perhaps 2 is the best for your use case.
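For reference, option 1 can be sketched on the caller's side without touching the library. The snippet below is a standalone illustration: `geometricMean` is a naive log-based version written for this example (not the library's code), and `geometricMeanOfNegatives` is a hypothetical helper name.

```javascript
// Naive log-based geometric mean; a standalone sketch, not the library's code.
function geometricMean(values) {
  const sumOfLogs = values.reduce((sum, v) => sum + Math.log(v), 0);
  return Math.exp(sumOfLogs / values.length);
}

// Option 1: negate on the way in, negate again on the way out.
// Only valid when every value is negative.
function geometricMeanOfNegatives(values) {
  return -geometricMean(values.map((v) => -v));
}

const samples = [-2, -8];
console.log(geometricMean(samples));            // NaN, because Math.log(-2) is NaN
console.log(geometricMeanOfNegatives(samples)); // close to -4
```

The same flip would apply to anything else read back in the data's units, such as bucket limits or percentiles, if the flipped values are what gets pushed into the library.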

bluesmoon avatar Sep 10 '15 16:09 bluesmoon

Thanks for the quick response!

Well, that's good to hear the buckets will work. It's not good to hear that mean and standard deviation aren't calculated correctly for negative values. Doh! I am actually using those, and the numbers sort of look reasonable, which is odd. I'll have to think about what to do. Maybe #1 is the better option for me.

sjmcdowall avatar Sep 10 '15 16:09 sjmcdowall

Mean and standard deviation will work. Geometric mean and geometric standard deviation will not work.

bluesmoon avatar Sep 10 '15 16:09 bluesmoon

Oh phew! Well then .. I am ok since I am just using the boring old mean and STD. Thanks!!

sjmcdowall avatar Sep 10 '15 16:09 sjmcdowall

If you want to fix it with method 3, the bug is in lines 131 & 132 of faststats.js: https://github.com/bluesmoon/node-faststats/blob/master/faststats.js#L131-L132 where the sign would have to be flipped. Geometric stats are only defined if all terms are of the same sign.
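A rough sketch of what method 3 could look like as a standalone function follows. This is not a patch for the linked lines, whose contents are not reproduced here; `signedGeometricMean` is a hypothetical name, and returning a negative result for all-negative input is the convention assumed from the discussion above.

```javascript
// Standalone sketch of a sign-aware geometric mean; NOT the library's code.
// It remembers the common sign, works on absolute values, then restores the sign.
function signedGeometricMean(values) {
  if (values.length === 0) {
    throw new RangeError('no values');
  }
  const sign = Math.sign(values[0]);
  // Geometric stats are only defined if all terms share the same non-zero sign.
  if (sign === 0 || values.some((v) => Math.sign(v) !== sign)) {
    throw new RangeError('all values must share one non-zero sign');
  }
  const sumOfLogs = values.reduce((sum, v) => sum + Math.log(Math.abs(v)), 0);
  return sign * Math.exp(sumOfLogs / values.length);
}

console.log(signedGeometricMean([-2, -8])); // close to -4
console.log(signedGeometricMean([2, 8]));   // close to 4
```

Mixed-sign or zero input throws instead of silently producing NaN, which makes the "same sign" requirement explicit to the caller.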

bluesmoon avatar Sep 10 '15 16:09 bluesmoon