datafusion-comet icon indicating copy to clipboard operation
datafusion-comet copied to clipboard

feat: Implement shared memory pool for case where spark.memory.offHeap.enabled=false

Open andygrove opened this issue 1 year ago • 2 comments

Which issue does this PR close?

Closes https://github.com/apache/datafusion-comet/issues/996

Rationale for this change

Simplify memory configuration.

What changes are included in this PR?

Allocate one shared pool per executor, rather than one pool per native plan, when spark.memory.offHeap.enabled=false.

How are these changes tested?

  • Existing tests
  • Ran TPC-H benchmarks

andygrove avatar Oct 08 '24 00:10 andygrove

I'm a bit worried about this approach because we are implementing greedy mode inside CometTaskMemoryManager, which is known to starve consumers frequently. I prefer using fair spill pool for "native memory management" mode. This makes spillable operators work properly without being starved but with the cost of memory pool under-utilization.

Kontinuation avatar Oct 08 '24 07:10 Kontinuation

I'm a bit worried about this approach because we are implementing greedy mode inside CometTaskMemoryManager, which is known to starve consumers frequently. I prefer using fair spill pool for "native memory management" mode. This makes spillable operators work properly without being starved but with the cost of memory pool under-utilization.

Thanks for the feedback. I will work on a separate PR for the fair spill approach. I am moving this PR to draft for now.

andygrove avatar Oct 08 '24 16:10 andygrove

Closing in favor of https://github.com/apache/datafusion-comet/pull/1021

andygrove avatar Oct 16 '24 21:10 andygrove