integer value 99999999999999999999999 is taken as 10,000,000,000,000,000,000,000

Open nasrin1748 opened this issue 2 years ago • 7 comments

35

nasrin1748 avatar Oct 24 '23 05:10 nasrin1748

These are great bug reports! I really appreciate it, I'm digging in.

A couple of requests that will help me fix these more quickly.

  1. Can you put the actual code in text to reproduce?
  2. Can you try running the following commands (a picture is fine for these)?
bw = BuckarooWidget(offending_df, autoType=False)
bw
#close cell

next cell

bw.origDf

and

bw = BuckarooWidget(offending_df)
bw
#close cell

next cell

bw.origDf

There are three possible sources of the problem in each of these cases: autoTyping (Python widget code), the formatter that the widget code hints at in table_hints, and the behavior of the formatter in the frontend.

origDf is the serialized JSON that is sent to the frontend code. I look at that and can tell if the autoTyping is sending the wrong value to the frontend, or if the frontend is formatting it improperly.

paddymul avatar Oct 24 '23 13:10 paddymul

Check out this issue for the end state of how I want to be able to handle this https://github.com/paddymul/buckaroo/issues/74

paddymul avatar Oct 24 '23 13:10 paddymul

2

For this dataframe I created a BuckarooWidget as

bw = BuckarooWidget(offending_df, autoType=False)
bw

But I am getting the error KeyError: "['mean'] not in index" for numbers beyond 19 digits.

bw.origDf (with autoType=False)

{'schema': {'fields': [{'name': 'index'}, {'name': 'Values'}]}, 'data': [{'index': 0, 'Values': 1}, {'index': 1, 'Values': 2}, {'index': 2, 'Values': 9999999999999999999}], 'table_hints': {'Values': {'is_numeric': True, 'is_integer': True, 'min_digits': 1, 'max_digits': 20, 'histogram': [{'name': 1, 'cat_pop': 33.0}, {'name': 2, 'cat_pop': 33.0}, {'name': 9999999999999999999, 'cat_pop': 33.0}, {'name': 'longtail', 'unique': 100.0}]}}}

bw.origDf (with the default autoType)

{'schema': {'fields': [{'name': 'index'}, {'name': 'Values'}]}, 'data': [{'index': 0, 'Values': 1}, {'index': 1, 'Values': 2}, {'index': 2, 'Values': 9999999999999999999}], 'table_hints': {'Values': {'is_numeric': True, 'is_integer': True, 'min_digits': 1, 'max_digits': 20, 'histogram': [{'name': 1, 'cat_pop': 33.0}, {'name': 2, 'cat_pop': 33.0}, {'name': -8446744073709551617, 'cat_pop': 33.0}, {'name': 'longtail', 'unique': 100.0}]}}}

The histogram name should be the same in both outputs, but it differs: 9999999999999999999 in the first and -8446744073709551617 in the second.
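
The negative name in the second output equals 9999999999999999999 minus 2**64, which is what a signed 64-bit wraparound would produce. A minimal sketch of that arithmetic in plain Python follows; `wrap_int64` is a hypothetical helper, and where (or whether) such a cast happens inside buckaroo is an assumption, not something this thread confirms.

```python
INT64_MAX = 2**63 - 1  # largest value a signed 64-bit integer can hold


def wrap_int64(n: int) -> int:
    """Reinterpret an arbitrary Python int as a signed 64-bit value (hypothetical helper)."""
    return ((n + 2**63) % 2**64) - 2**63


value = 9999999999999999999  # the value from the 'Values' column above
print(value > INT64_MAX)     # the value does not fit in int64
print(wrap_int64(value))     # the value after wrapping around
```

Values that already fit in int64 pass through `wrap_int64` unchanged, so only the 19-digit entry would be affected.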

JavaScript rounds a number to the nearest representable value if the number is too big, so 9999999999999999999999999999 is taken as 10000000000000000000000000000. Python keeps the exact value, but JavaScript does not.
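
The rounding can be reproduced without a browser, because a Python `float` is an IEEE 754 double, the same format as a JavaScript `Number`. The sketch below is only an illustration of that rounding; `survives_double_round_trip` is a hypothetical helper name, not part of buckaroo.

```python
# Same value as JavaScript's Number.MAX_SAFE_INTEGER.
MAX_SAFE_INTEGER = 2**53 - 1


def survives_double_round_trip(n: int) -> bool:
    """Return True if the integer n is unchanged after conversion to a double."""
    return int(float(n)) == n


for n in (2, 2**53, 2**53 + 1, 9999999999999999999):
    # Exact Python int, the double it becomes, and whether the value survived.
    print(n, int(float(n)), survives_double_round_trip(n))
```

Integers above `MAX_SAFE_INTEGER` are not guaranteed to survive the conversion, which is why the Python side and the frontend can disagree about the same cell.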

nasrin1748 avatar Oct 24 '23 15:10 nasrin1748

"Can you put the actual code in text to reproduce?" Can you elaborate?

nasrin1748 avatar Oct 25 '23 14:10 nasrin1748

The core issue is that JavaScript treats really large ints as floats and rounds them. Look into using BigInt https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/BigInt

and hinting from table_hints that BigInt is required.
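
One way the Python side could decide when to send such a hint is to check whether any integer in a column falls outside JavaScript's safe integer range. This is a sketch under assumptions: the `needs_bigint` key and both helper functions are hypothetical and do not exist in buckaroo's table_hints today.

```python
from typing import Iterable

# Same value as JavaScript's Number.MAX_SAFE_INTEGER.
MAX_SAFE_INTEGER = 2**53 - 1


def needs_bigint(values: Iterable[object]) -> bool:
    """Return True if any integer in values cannot be held exactly by a JS Number."""
    for v in values:
        # bool is a subclass of int in Python, so exclude it explicitly.
        if isinstance(v, int) and not isinstance(v, bool) and abs(v) > MAX_SAFE_INTEGER:
            return True
    return False


def with_bigint_hint(hints: dict, values: Iterable[object]) -> dict:
    """Return a copy of a column's hints with a hypothetical 'needs_bigint' flag added."""
    updated = dict(hints)
    updated["needs_bigint"] = needs_bigint(values)  # hypothetical key, not a real buckaroo hint
    return updated


column_hints = with_bigint_hint({"is_numeric": True, "is_integer": True},
                                [1, 2, 9999999999999999999])
print(column_hints)
```

The frontend formatter could then switch to BigInt only for columns where the flag is set, leaving ordinary numeric columns untouched.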

paddymul avatar Oct 25 '23 16:10 paddymul

Converting to string is one option: console.log(BigInt("9999999999999999999999999999999999999999").toString())
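
The string option needs a matching step on the Python side, since a bare JSON number would already be rounded by the time the frontend parses it. A minimal sketch, assuming the payload has the same shape as the origDf output above; `stringify_unsafe_ints` is a hypothetical helper, not existing buckaroo code.

```python
import json

# Same value as JavaScript's Number.MAX_SAFE_INTEGER.
MAX_SAFE_INTEGER = 2**53 - 1


def stringify_unsafe_ints(obj):
    """Recursively replace ints outside the JS safe range with their decimal strings."""
    if isinstance(obj, dict):
        return {key: stringify_unsafe_ints(val) for key, val in obj.items()}
    if isinstance(obj, list):
        return [stringify_unsafe_ints(val) for val in obj]
    # bool is a subclass of int in Python, so exclude it explicitly.
    if isinstance(obj, int) and not isinstance(obj, bool) and abs(obj) > MAX_SAFE_INTEGER:
        return str(obj)
    return obj


payload = {"data": [{"index": 0, "Values": 1},
                    {"index": 2, "Values": 9999999999999999999}]}
encoded = json.dumps(stringify_unsafe_ints(payload))
print(encoded)
```

On the JavaScript side, the string can be passed to `BigInt(...)` as in the snippet above, so no digits are lost in transit.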

nasrin1748 avatar Oct 26 '23 04:10 nasrin1748

It's not just 9999999999999999999999999; other numbers are also taken differently. For example, 33333333333333333333333333333 is taken as 333333000000000000000000000000.

{'schema': {'fields': [{'name': 'index'}, {'name': 'inf'}]}, 'data': [{'index': 0, 'inf': 33333333333333333333333333333}], 'table_hints': {'inf': {'is_numeric': False, 'is_integer': False, 'min_digits': None, 'max_digits': None, 'histogram': [{'name': 33333333333333333333333333333, 'cat_pop': 100.0}, {'name': 'longtail', 'unique': 100.0}]}}}

{'schema': {'fields': [{'name': 'index'}, {'name': 'inf'}]}, 'data': [{'index': 0, 'inf': 3.333333333e+28}], 'table_hints': {'inf': {'is_numeric': True, 'is_integer': False, 'min_digits': 29, 'max_digits': 29, 'histogram': [{'name': 3.333333333e+28, 'cat_pop': 100.0}, {'name': 'longtail', 'unique': 100.0}]}}}

For numbers with fewer digits is_numeric is True, and for numbers with more digits is_numeric is False. That is most likely the cause of the error.

nasrin1748 avatar Oct 29 '23 05:10 nasrin1748