
Need to clean up intermediate data in Ballista

Open · Ted-Jiang opened this issue 3 years ago • 1 comment

Is your feature request related to a problem or challenge? Please describe what you are trying to do.

We need to check whether the state saved in sled is consumed by the UI. If it is not, we can clean up the job/task data as soon as the SQL query finishes.

If the state is consumed by the UI, we can choose either an LRU-based policy, like Spark, or a time-based eviction policy.
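To make the two options concrete, here is a minimal sketch that combines them: entries older than a TTL are evicted first, and if the index is still over capacity the least recently accessed jobs go next. `CompletedJobs`, `touch`, and `evict` are hypothetical names for illustration, not existing Ballista types; the sketch only tracks job ids in memory and leaves the actual deletion from sled to the caller.

```rust
use std::collections::HashMap;
use std::time::{Duration, Instant};

/// Hypothetical in-memory index of finished jobs (not a Ballista type).
/// It only decides *which* job ids to evict; the caller deletes the state.
pub struct CompletedJobs {
    ttl: Duration,
    max_jobs: usize,
    /// job id -> last time the job state was read (for example by the UI)
    last_access: HashMap<String, Instant>,
}

impl CompletedJobs {
    pub fn new(ttl: Duration, max_jobs: usize) -> Self {
        Self { ttl, max_jobs, last_access: HashMap::new() }
    }

    /// Record that a job finished or that its state was just read.
    pub fn touch(&mut self, job_id: &str, now: Instant) {
        self.last_access.insert(job_id.to_string(), now);
    }

    /// Return the job ids whose state should be deleted at time `now`.
    pub fn evict(&mut self, now: Instant) -> Vec<String> {
        let ttl = self.ttl;

        // Time-based pass: anything not accessed within `ttl` is evicted.
        let mut evicted: Vec<String> = self
            .last_access
            .iter()
            .filter(|(_, t)| now.saturating_duration_since(**t) > ttl)
            .map(|(id, _)| id.clone())
            .collect();
        evicted.sort(); // deterministic order for callers and logs
        for id in &evicted {
            self.last_access.remove(id);
        }

        // LRU pass: if still over capacity, drop the least recently used.
        if self.last_access.len() > self.max_jobs {
            let mut remaining: Vec<(String, Instant)> = self
                .last_access
                .iter()
                .map(|(id, t)| (id.clone(), *t))
                .collect();
            remaining.sort_by_key(|(_, t)| *t);
            let excess = remaining.len() - self.max_jobs;
            for (id, _) in remaining.into_iter().take(excess) {
                self.last_access.remove(&id);
                evicted.push(id);
            }
        }
        evicted
    }
}
```

A scheduler could call `evict` on a timer and delete the returned job ids from its state store. Passing `now` in explicitly, rather than reading the clock inside `evict`, keeps the policy easy to test.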

Regarding shuffle files, we also need to implement a way to clean them up. This is a little more complex because the files have to be removed on every host. ~~We might need to add new RPCs.~~
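As a rough sketch of the per-host half of that cleanup, each executor could remove a finished job's shuffle directory from its local work directory. This assumes shuffle output is laid out as `<work_dir>/<job_id>/...`, which should be checked against the actual executor layout; `remove_job_shuffle_dir` is a hypothetical helper, and how the scheduler tells each host to run it is left open.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Hypothetical helper: delete all shuffle files for `job_id` under `work_dir`.
/// Assumes the layout `<work_dir>/<job_id>/...` (an assumption, not confirmed).
/// Returns Ok(true) if a directory was removed, Ok(false) if none existed.
pub fn remove_job_shuffle_dir(work_dir: &Path, job_id: &str) -> io::Result<bool> {
    // Reject ids that could resolve outside `work_dir`.
    let suspicious = job_id.is_empty()
        || job_id == "."
        || job_id == ".."
        || job_id.contains('/')
        || job_id.contains('\\');
    if suspicious {
        return Err(io::Error::new(io::ErrorKind::InvalidInput, "invalid job id"));
    }

    let dir = work_dir.join(job_id);
    if !dir.is_dir() {
        // Nothing to clean up on this host (already removed or never written).
        return Ok(false);
    }
    fs::remove_dir_all(&dir)?;
    Ok(true)
}
```

Returning `Ok(false)` for a missing directory makes the call idempotent, so a scheduler can retry the cleanup on any host without treating an earlier success as an error.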

Ted-Jiang · Jan 24 '22 09:01

Related to apache/arrow-ballista#7

Ted-Jiang · Feb 08 '22 05:02