Daft feat: add forward and backward strategies for fill

Changes Made

Add forward and backward strategies for fill_null. Kept backwards compatibility for the existing fill_null by either passing no strategy or strategy == 'value'.

Related Issues

Closes #4683

Checklist

[x] Documented in API Docs (if applicable)
[x] Documented in User Guide (if applicable)
[x] If adding a new documentation page, doc is added to docs/mkdocs.yml navigation
[x] Documentation builds and is formatted properly (tag @/ccmao1130 for docs review)

Jul 15 '25 18:07 nsalerni

Two failing checks are failing on main prior to this PR.

Jul 15 '25 20:07 nsalerni

Codecov Report

:x: Patch coverage is 78.02198% with 60 lines in your changes missing coverage. Please review. :white_check_mark: Project coverage is 78.76%. Comparing base (87cab23) to head (c145ad2). :warning: Report is 512 commits behind head on main.

Files with missing lines	Patch %	Lines
src/daft-core/src/join.rs	50.00%	11 Missing :warning:
src/daft-dsl/src/expr/display.rs	0.00%	11 Missing :warning:
src/daft-logical-plan/src/partitioning.rs	0.00%	10 Missing :warning:
src/daft-logical-plan/src/ops/project.rs	0.00%	9 Missing :warning:
src/daft-core/src/series/ops/null.rs	92.52%	8 Missing :warning:
daft/series.py	85.71%	3 Missing :warning:
src/daft-dsl/src/expr/mod.rs	92.50%	3 Missing :warning:
src/daft-dsl/src/visitor.rs	0.00%	3 Missing :warning:
daft/expressions/expressions.py	94.11%	1 Missing :warning:
src/daft-recordbatch/src/lib.rs	88.88%	1 Missing :warning:

Additional details and impacted files

@@            Coverage Diff             @@
##             main    #4776      +/-   ##
==========================================
- Coverage   78.90%   78.76%   -0.15%     
==========================================
  Files         886      886              
  Lines      123324   124296     +972     
==========================================
+ Hits        97313    97901     +588     
- Misses      26011    26395     +384

Files with missing lines	Coverage Δ
src/daft-core/src/lib.rs	`100.00% <100.00%> (ø)`
src/daft-core/src/python/series.rs	`90.47% <100.00%> (+0.37%)`	:arrow_up:
src/daft-dsl/src/python.rs	`79.38% <100.00%> (+0.45%)`	:arrow_up:
daft/expressions/expressions.py	`96.32% <94.11%> (-0.06%)`	:arrow_down:
src/daft-recordbatch/src/lib.rs	`85.76% <88.88%> (-0.02%)`	:arrow_down:
daft/series.py	`91.86% <85.71%> (-0.26%)`	:arrow_down:
src/daft-dsl/src/expr/mod.rs	`79.53% <92.50%> (+0.58%)`	:arrow_up:
src/daft-dsl/src/visitor.rs	`56.22% <0.00%> (-0.49%)`	:arrow_down:
src/daft-core/src/series/ops/null.rs	`92.37% <92.52%> (-7.63%)`	:arrow_down:
src/daft-logical-plan/src/ops/project.rs	`58.17% <0.00%> (-0.11%)`	:arrow_down:
... and 3 more

... and 18 files with indirect coverage changes

:rocket: New features to boost your workflow:

:snowflake: Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
:package: JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

Jul 15 '25 21:07 codecov[bot]

Hi @nsalerni, thank you for contributing this! I took a quick look at this PR, the main issue is that this implementation is not fully correct, as it only fills nulls within a RecordBatch, and does not carry them over.

For example, if we set the RecordBatch sizes to 1 using default_morsel_size, you will see this behavior:
>>> import daft
>>> daft.context.set_execution_config(default_morsel_size=1)
DaftContext(_ctx=<builtins.PyDaftContext object at 0x102b7a5f0>)
>>> df = daft.from_pydict({"a": [1, None]})
>>> df.select(df["a"].fill_null(strategy="forward")).show()
╭───────╮
│ a     │
│ ---   │
│ Int64 │
╞═══════╡
│ 1     │
├╌╌╌╌╌╌╌┤
│ None  │
╰───────╯

(Showing first 2 of 2 rows)
Fixing this is not trivial and requires building up mechanisms in our physical pipelines. This is actually something we also want to build for our non-partitioned window functions as well, and would be a significant enhancement to Daft! So if you are up to it, let's talk more. I am happy to help guide you

I have some other comments too but they are minor and I will leave them for a more thorough review.

@kevinzwang happy to take a first go, let me ping you in the daft slack ! 😄

Jul 16 '25 00:07 nsalerni

Hi @nsalerni 👋 Thanks for this contribution! Just checking in here -- how is it going addressing the review feedback?

Oct 21 '25 21:10 malcolmgreaves

Hi @nsalerni - Checking to see if this is something you are still able to push through.

Nov 18 '25 20:11 madvart

feat: add forward and backward strategies for fill_null

Changes Made

Related Issues

Checklist

Codecov Report