incubator-gluten
[VL] Support file cache spill in Gluten
Description
The Velox backend provides a 2-level file cache (AsyncDataCache and SsdCache), and we have enabled it in PR, using a dedicated MMapAllocator initialized with a configured capacity. This part of memory is not counted as execution memory or storage memory, and it is not managed by Spark's UnifiedMemoryManager. In this ticket, we would like to fill this gap with the following design:
- Add a NativeStorageMemory segment in the vanilla StorageMemory. We will have a configuration, spark.memory.native.storageFraction, to define its size. We then use the size offheap.memory * spark.memory.storageFraction * spark.memory.native.storageFraction to initialize AsyncDataCache.
- Add the configuration spark.memory.storage.preferSpillNative to determine whether to spill the RDD cache or the native file cache first when storage memory has to shrink. For example, when queries are mostly executed on the same data sources, we prefer to keep the native file cache.
- Introduce NativeMemoryStore to provide interfaces similar to the vanilla MemoryStore and to call AsyncDataCache::shrink when eviction is needed.
- Introduce NativeStorageMemoryAllocator, a memory allocator used for creating AsyncDataCache. It is wrapped with a ReservationListener to track memory usage in the native cache.
- VeloxBackend initialization will be done without the cache being created. We will call VeloxBackend::setAsyncDatacache when the memory pools are initialized.
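As a rough illustration of the sizing rule and the spill preference in the list above, here is a minimal Java sketch. It is not the Gluten implementation: the class `NativeStorageSizing`, the enum `SpillTarget`, and both method names are hypothetical, and it assumes that `spark.memory.storage.preferSpillNative=true` means the native file cache is shrunk before the RDD cache. Only the configuration keys and the size formula come from the proposal.

```java
// Hypothetical sketch; none of these types exist in Gluten or Spark.
final class NativeStorageSizing {

    /** Which storage region is asked to give up memory first (illustrative only). */
    enum SpillTarget { RDD_CACHE, NATIVE_FILE_CACHE }

    /**
     * Computes the capacity proposed for AsyncDataCache:
     * offheap.memory * spark.memory.storageFraction * spark.memory.native.storageFraction.
     */
    public static long nativeStorageBytes(
            long offHeapBytes, double storageFraction, double nativeStorageFraction) {
        if (offHeapBytes < 0) {
            throw new IllegalArgumentException("off-heap size must be non-negative");
        }
        checkFraction("spark.memory.storageFraction", storageFraction);
        checkFraction("spark.memory.native.storageFraction", nativeStorageFraction);
        // Truncate toward zero so the cache never exceeds its share.
        return (long) (offHeapBytes * storageFraction * nativeStorageFraction);
    }

    /**
     * Picks the region to shrink first. Assumption: preferSpillNative == true
     * means the native file cache is evicted before the RDD cache.
     */
    public static SpillTarget firstSpillTarget(boolean preferSpillNative) {
        return preferSpillNative ? SpillTarget.NATIVE_FILE_CACHE : SpillTarget.RDD_CACHE;
    }

    private static void checkFraction(String key, double value) {
        if (value < 0.0 || value > 1.0) {
            throw new IllegalArgumentException(key + " must be in [0, 1], got " + value);
        }
    }

    private NativeStorageSizing() {}
}
```

For example, with 8 GiB of off-heap memory and both fractions set to 0.5, `nativeStorageBytes(8L << 30, 0.5, 0.5)` yields 2 GiB for the native cache. A workload that repeatedly scans the same data sources would leave `preferSpillNative` off under this reading, so the RDD cache is shrunk first and the file cache stays warm.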
The key code path will look like the following: