connector-x icon indicating copy to clipboard operation
connector-x copied to clipboard

[PostgreSQL] thread '<unnamed>' panicked when selecting data with small values (e.g. 1e-130)

Open chienhsiang opened this issue 2 years ago • 11 comments

What language are you using?

Python.

What version are you using?

python 3.10.8 and connectorx 0.3.1

What database are you using?

PostgreSQL

What dataframe are you using?

Pandas and Arrow

Can you describe your bug?

My table contains some data with small values (e.g. 8.5e-130) When selecting that row, I ran into the error:

thread <unnamed> panicked at 'attempt to divide by zero'

What's the minimal number that connectx supports?

chienhsiang avatar Jan 06 '23 03:01 chienhsiang

Hi @chienhsiang , can you provide a simple example for reproducing? Including the table creation, value insertion and the select query that caused the error.

wangxiaoying avatar Jan 09 '23 22:01 wangxiaoying

Hi @wangxiaoying, thank you for your response!

Create a test table:

psql -d postgres -c "DROP TABLE IF EXISTS test_table; CREATE TABLE test_table (key int, value numeric);"
psql -d postgres -c "INSERT INTO test_table VALUES (0, 1.3), (1, 0.004), (2, 1e-20), (3, 1e-130), (4, 1e-200);"

Fetch data:

import connectorx as cx

uri = "postgresql://..."

# This works
cx.read_sql(uri, "select * from test_table where key < 3")

# This fails
cx.read_sql(uri, "select * from test_table where key >= 3")

# This fails
cx.read_sql(uri, "select * from test_table where key = 3")

# This fails
cx.read_sql(uri, "select * from test_table where key = 4")

Feel free to let me know if anything is unclear! Thank you for looking into this!

chienhsiang avatar Jan 10 '23 00:01 chienhsiang

Hi @chienhsiang , thanks for the examples! The code panics at rust_decimal when we converting the decimal to float64. I opened an issue there.

wangxiaoying avatar Jan 12 '23 05:01 wangxiaoying

This bug has been tackled in a PR of upstream lib rust_decimal by returning 0 instead of throwing an error in this case. Currently the precision of decimal cannot proceed 28 (more details can be found in the comment). I will update the dependency once it has been published in cargo.

wangxiaoying avatar Jan 14 '23 00:01 wangxiaoying

I see. Thank you for the help!

chienhsiang avatar Jan 14 '23 02:01 chienhsiang

Hi @wangxiaoying, the dependency has been published in cargo, can you please update it? Thank you for the help!

ghilesmeddour avatar Jan 26 '23 13:01 ghilesmeddour

@ghilesmeddour thanks for the reminder. Can you try 0.3.2-alpha.2? @chienhsiang

wangxiaoying avatar Jan 26 '23 23:01 wangxiaoying

Hi @wangxiaoying,

Thank you for the update! The error is gone in 0.3.2a. However, it converts the small number to zero, which is not ideal for me. Those numbers are p-values, so it is crucial not to round them to zero. Maybe throw a warning or error which explicitly states that those numbers are outside the precision range that connectorx currently supports? Then users can decide whether to switch to, say, psycopg2 if they need the precision in those cases.

Thanks again for your help!

chienhsiang avatar Jan 27 '23 00:01 chienhsiang

Hi @chienhsiang , just like I mentioned earlier, the small numbers will be converted to zero, which is done by the upstream library rust_decimal since it does not support precision larger than 28 in the current version. It is reasonable to have a warning for this but I think maybe it would be better done in the upstream library instead of in connectorx. I will mention this in the document.

wangxiaoying avatar Jan 27 '23 19:01 wangxiaoying

Hi @wangxiaoying, I fully agree! A warning and documentation will be super helpful! Hopefully rust_decimal will soon support scientific notation calculation with small numbers. Thank you so much!

chienhsiang avatar Jan 27 '23 22:01 chienhsiang

Seeing the same thing where the postgres type is numeric with no precision or scale specified and values have been inserted that exceed the rust decimals precision/scale.

I just cast them on the server (i.e. cast(data as numeric(18,6))) - I agree a warning that the upstream datatype cannot be represented fully would be helpful

david-waterworth avatar Mar 07 '23 04:03 david-waterworth