ibis feat: avoid pandas converting float("nan") to NULL in memtable

Open NickCrews opened this issue 8 months ago • 0 comments

Is your feature request related to a problem?

In pandas, NaNs are treated as NULL. Because we use pandas to create a dataframe during memtable creation, a user who specifies float("nan") gets a NULL. In my opinion, the ideal behavior would be that they get a true NaN. Possibly related: a comment in duckdb.

ibis.memtable({"f": [None, float("-inf"), 3.0, float("inf"), float("nan")]}).f
┏━━━━━━━━━┓
┃ f       ┃
┡━━━━━━━━━┩
│ float64 │
├─────────┤
│    NULL │
│    -inf │
│     3.0 │
│     inf │
│    NULL │
└─────────┘

What is the motivation behind your request?

This came up for me when I wanted to test NaN vs. NULL behavior in https://github.com/ibis-project/ibis/issues/11029, but it seems like a basic IO operation we should support.

Describe the solution you'd like

pyarrow does the conversion right. Could we use that and avoid pandas? It looks like it would be a hassle, because we currently use pandas.DataFrame(), which is a catchall that accepts many different shapes of data. If we used pyarrow, we would have to determine which of pa.Table.from_pydict, pa.Table.from_pylist, etc. to use. And even then there might be formats we don't support. Of course, if we don't support them, we could raise an error and tell the user they need to call ibis.memtable(pd.DataFrame(your_data)) themselves and are responsible for the weirdness of pandas.
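To make the dispatch idea concrete, here is a rough sketch, not a proposed implementation. It assumes the pyarrow constructors in question are pa.Table.from_pydict (column-oriented dicts) and pa.Table.from_pylist (lists of row dicts). The helper names pick_constructor and to_arrow_table are hypothetical and do not exist in ibis.

```python
from collections.abc import Mapping, Sequence


def pick_constructor(data):
    """Return the name of the pyarrow.Table constructor matching the shape of `data`.

    Hypothetical helper for illustration only; it is not part of ibis.
    """
    # Column-oriented input: {"col": [values, ...], ...}
    if isinstance(data, Mapping):
        return "from_pydict"
    # Row-oriented input: [{"col": value, ...}, ...]
    # An empty list is rejected here because its shape cannot be inferred.
    if (
        isinstance(data, Sequence)
        and not isinstance(data, (str, bytes))
        and len(data) > 0
        and all(isinstance(row, Mapping) for row in data)
    ):
        return "from_pylist"
    # Anything else: fail loudly and point the user at the pandas escape hatch.
    raise TypeError(
        "unsupported input shape; build a DataFrame yourself and call "
        "ibis.memtable(pd.DataFrame(your_data))"
    )


def to_arrow_table(data):
    """Build a pyarrow.Table from plain Python data without going through pandas."""
    # Imported lazily so that shape detection works even without pyarrow installed.
    import pyarrow as pa

    return getattr(pa.Table, pick_constructor(data))(data)
```

With something like this, to_arrow_table({"f": [None, float("nan")]}) should yield a float64 column with one null and one real NaN, since pyarrow only maps None to null when converting plain Python lists. Shapes that pandas.DataFrame() accepts but this sketch does not (nested lists, numpy arrays, and so on) would hit the TypeError branch.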

What version of ibis are you running?

main

What backend(s) are you using, if any?

duckdb, but I think this should affect all

Code of Conduct

[x] I agree to follow this project's Code of Conduct

Apr 03 '25 17:04 NickCrews
