
[Feature] Add Expression helper to fill NA

Open jaychia opened this issue 2 years ago • 0 comments

Summary

It would be convenient to have a .fillna function that fills all null/NaN values in a column.

Proposal

df["x"].float.fillnan(0.5)
df["x"].fillnull(0.5)
df["x"].fillna(0.5)

The above expressions behave slightly differently:

  1. .float.fillnan() fills all NaN float values, and should only work for a float expression!
  2. .fillnull() fills all null values - this should work for any expression
  3. .fillna() is a convenience function that fills all null and NaN values

The null case should be easy to implement by aliasing (df["x"].is_null()).if_else(val, df["x"])!
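To make the difference between the three fills concrete, here is a minimal pure-Python sketch of the intended semantics. It operates on plain lists rather than Daft expressions, and the helper names (fill_null, fill_nan, fill_na, is_nan) are hypothetical stand-ins for illustration, not Daft's API.

```python
import math
from typing import Any, List


def is_nan(value: Any) -> bool:
    # NaN only exists for floats; None (null) is a separate concept.
    return isinstance(value, float) and math.isnan(value)


def fill_null(values: List[Any], fill: Any) -> List[Any]:
    # Mirrors is_null().if_else(val, e): replace nulls, leave NaN untouched.
    return [fill if v is None else v for v in values]


def fill_nan(values: List[Any], fill: float) -> List[Any]:
    # Assumption: a non-float column is rejected, as the proposal requires.
    for v in values:
        if v is not None and not isinstance(v, float):
            raise TypeError("fill_nan only applies to float columns")
    # Replace NaN, leave nulls untouched.
    return [fill if is_nan(v) else v for v in values]


def fill_na(values: List[Any], fill: Any) -> List[Any]:
    # Convenience: replace both nulls and NaN.
    return [fill if v is None or is_nan(v) else v for v in values]


column = [1.0, None, float("nan"), 4.0]
filled_null = fill_null(column, 0.5)  # null replaced, NaN kept
filled_nan = fill_nan(column, 0.5)    # NaN replaced, null kept
filled_na = fill_na(column, 0.5)      # both replaced
```

Only fill_null maps directly onto the is_null()/if_else alias; the NaN variants would additionally need a NaN check on the expression.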

Discussed in https://github.com/Eventual-Inc/Daft/discussions/482

Originally posted by jaychia January 19, 2023

Summary

It is useful to have a .fillna equivalent. However, we should be specific about whether this fills NaN or null values.

Proposal

e.fill_null(val) can just alias e.is_null().if_else(val, e)

jaychia, Feb 10 '23 17:02