Junfan Zhang
Could you help review this? @advancedxy
> Thanks for working on this. I did a quick overview about this change, I think it's quite large to review. It would best to keep this pr open and...
Good to know. Thanks for your detailed reply. @advancedxy
> However, I think this PR should be split into two PRs at least: handle fetch failure and handle write failures. I think the write failure is different from fetch...
After digging into this feature, I found many bugs and improvements that need to be addressed, so I will split them into small patches. And this...
> Will this issue lead to some more serious problems? Nope. It just leaves some unnecessary data behind, but this is critical for stage retry
cc @maobaolong
> > The stage retry whole design is based on the Spark's fetchFailedException , once spark scheduler accepts this exception thrown from task, it will kill all the running tasks...
To ensure data consistency, I will include the original unique task attempt id in the blockId layout, so that on stage retry we can filter out the previous stage attempt's data. PTAL @EnricoMi
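To illustrate the idea, here is a minimal sketch of packing a task attempt id into a 64-bit blockId and filtering blocks by it on the read path. The bit widths and field order here are hypothetical, chosen only for illustration; they are not the project's actual blockId layout.

```java
import java.util.Set;

// Hypothetical sketch: pack a unique task attempt id into a 64-bit blockId
// so the reader can drop blocks written by an earlier stage attempt.
public class BlockIdSketch {
    // Assumed illustrative layout: [sequenceNo:21][partitionId:21][taskAttemptId:22]
    static final int TASK_BITS = 22;
    static final int PARTITION_BITS = 21;

    static long encode(long sequenceNo, long partitionId, long taskAttemptId) {
        return (sequenceNo << (PARTITION_BITS + TASK_BITS))
                | (partitionId << TASK_BITS)
                | taskAttemptId;
    }

    static long taskAttemptId(long blockId) {
        // Extract the low TASK_BITS bits holding the task attempt id
        return blockId & ((1L << TASK_BITS) - 1);
    }

    // Accept only blocks whose task attempt id belongs to the current
    // (successful) stage attempt; stale attempts' data is filtered out.
    static boolean accept(long blockId, Set<Long> expectedTaskAttemptIds) {
        return expectedTaskAttemptIds.contains(taskAttemptId(blockId));
    }
}
```

With this kind of layout, the reader only needs the set of task attempt ids reported by the scheduler for the winning stage attempt; any block carrying an id outside that set is stale data from a prior attempt and is skipped.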
I can't find any scenario where the limit operator would consume data from both stage attempt 0 and stage attempt 1 as part of the normal data pipeline. Could you...