mars
mars copied to clipboard
[GRAPH][ENHANCEMENT] Optimize `SubtakGraph` generation
Problem
In gen_subtask_graph, Mars always create new out chunks even if the out chunk already exists. It costs a lot of time if there are plenty of chunks. Actually, there is no need to create out chunks except the FetchChunk.
I did a comparison, in which one creates new out chunks and the other does not. The result shows that the cost is reduced from 122.92s to 56.63s with about 53.93% reduction.
So we should optimize the SubtaskGraph generation and make it more efficient.