ibis icon indicating copy to clipboard operation
ibis copied to clipboard

refactor: deprecate `NA`/`fillna`/`dropna` in favor of name that uses `null` instead of `na`

Open jcrist opened this issue 2 years ago • 6 comments

Ibis has a few APIs that make use of the name "na".

  • Table.dropna for dropping rows containing null values
  • Table.fillna for replacing null values with specific values
  • ibis.NA as a null-value expression singleton

Nowhere else in ibis do we make use of the term "NA". Also, since some backends (e.g. pandas) conflate NaN and NULL, some users may expect Table.fillna to also replace NaN values in all backends (it does not). IMO we should deprecate these APIs in favor of ones that use the word "null" instead.

A few options:

Table.dropna

  • Table.dropnull?
  • Table.drop_null?
  • Table.drop_nulls? (this is the name polars uses)

My preference would be for Table.drop_null.

Table.fillna

  • Table.fillnull?
  • Table.fill_null? (this is the name polars uses)
  • Table.fill_nulls?
  • Table.ifnull? (since we have Column.nullif)

My preference would be for Table.fill_null.

ibis.NA

This one I'm tempted to remove entirely.

  • None should be valid as an input everywhere ibis.NA is
  • There's always ibis.null() for the rare cases where you need to construct a null value expression
  • If we still want a constant null, we could name it ibis.NULL instead?

jcrist avatar Oct 02 '23 15:10 jcrist

I am definitely in favor of removing ibis.NA and moving away from the na name in the fill_ and drop_ method. I'm good with drop_null and fill_null -- I am not on board with having one of them be plural and the other singular.

gforsyth avatar Oct 02 '23 16:10 gforsyth

related: #7147

I agree with everything you have here. I would lean towards fillnull and dropnull, there is plenty of existing precedent within ibis for names without underscores, and it is easier to type and readable enough. But if you went with underscores I wouldn't be furious :)

NickCrews avatar Oct 02 '23 19:10 NickCrews

+1 on consistency with plurals, no strong opinions on the separating underscore. I do find drop_null ever so slightly easier to read, but I'd also be okay with dropnull. Whoever implements this can have the final say :)

cpcloud avatar Oct 03 '23 08:10 cpcloud

I'm also in favor of the deprecations and have a slight preference towards underscore-less names.

kszucs avatar Oct 03 '23 11:10 kszucs

Folks, looking to get started with this. Couple of question on how to go about a deprecation cycle.

  1. Do we basically mark the existing API with de @deprecated(as_of="10.0", instead="use new api message") and copy the behavior with the new names?
  2. Do we duplicate the testing portion too?

ncclementi avatar May 21 '24 19:05 ncclementi

  1. I think so, yeah
  2. In the past we haven't duplicated the tests (just ported them to use the new APIs), but then added a single expression-level test (in ibis/tests/expr/...) that checks the expressions are equivalent and the deprecated one raises a warning.

jcrist avatar May 21 '24 19:05 jcrist