Daft icon indicating copy to clipboard operation
Daft copied to clipboard

[FEAT] Introduce actor pool physical plan

Open jaychia opened this issue 1 year ago • 1 comments

TODOs:

General

  • [ ] Properly propagate the num_actors to the physical plan from some user-facing API (TBD)
  • [ ] Fix the translation logic to correctly account for arbitrarily nested stateful/stateless UDFs across the entire projection list. Also when running simple projections like a + 1, or .alias(), should we just do it on the actor?
  • [x] Extend the ActorPoolProject physical node to also be a Logical concept
  • [x] Explore creating a new type of PartitionTask which is a PythonUDFPartitionTask. The current PartitionTask doesn't work great for expressing the physical plan because we just tag on single "marker" no-op instruction that isn't actually useful for execution.
  • [ ] Ensure proper teardown of actor pools (especially when faced with failures)

Ray

  • [x] Properly name the actor pools for Ray debuggability
  • [ ] Fix and figure out how to make this code work for remote Ray clusters, because the resource accounting and ray actor pools are being held remotely in the SchedulerActor, but the physical plan is run (I think?) locally? I might be wrong here.
  • [ ] Properly initialize the execution config per-actor based on the current run ID, instead of naively taking it from the environment

Py

  • [ ] Properly reserve resources for PyRunner's actor pools

jaychia avatar Jul 23 '24 03:07 jaychia

Codecov Report

Attention: Patch coverage is 31.74603% with 172 lines in your changes missing coverage. Please review.

Please upload report for BASE (main@fd67dac). Learn more about missing BASE report. Report is 4 commits behind head on main.

Additional details and impacted files

Impacted file tree graph

@@           Coverage Diff           @@
##             main    #2551   +/-   ##
=======================================
  Coverage        ?   63.28%           
=======================================
  Files           ?      982           
  Lines           ?   109417           
  Branches        ?        0           
=======================================
  Hits            ?    69240           
  Misses          ?    40177           
  Partials        ?        0           
Files Coverage Δ
daft/execution/execution_step.py 94.20% <100.00%> (ø)
daft/udf.py 92.96% <100.00%> (ø)
src/daft-dsl/src/functions/python/mod.rs 100.00% <100.00%> (ø)
src/daft-dsl/src/python.rs 93.21% <100.00%> (ø)
daft/expressions/expressions.py 93.68% <66.66%> (ø)
daft/execution/rust_physical_plan_shim.py 94.38% <33.33%> (ø)
daft/runners/runner.py 82.85% <72.72%> (ø)
src/daft-plan/src/physical_plan.rs 49.43% <28.57%> (ø)
src/daft-execution/src/stage/planner.rs 0.00% <0.00%> (ø)
src/daft-plan/src/physical_planner/translate.rs 92.15% <78.37%> (ø)
... and 5 more

codecov[bot] avatar Jul 25 '24 01:07 codecov[bot]