Results: 407 comments of roryqi

It's dangerous to delete a failed stage's data when we retry; it's hard to reach the condition under which the data can be deleted safely. We should rely on the data skip...

> > It's dangerous to delete the failed data of the stage when we retry. It's hard to reach the condition to delete the data.
>
> Could you describe...

@EnricoMi If the stage is retried, the taskId may not be unique, because we don't have a stage attemptId to distinguish task 1 attempt 0 in the stage attempt 0...
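The collision described above can be sketched as follows. This is a hypothetical illustration, not Uniffle's actual block-id layout: the class name `BlockIdSketch`, the helper `uniqueTaskId`, and the bit layout are all made up to show why packing the stage attempt number into the id disambiguates retried tasks.

```java
// Hypothetical sketch: without a stage attempt number, task 1 of stage
// attempt 0 and task 1 of stage attempt 1 produce the same taskId, so their
// shuffle data collides on the server. Packing the attempt number into the
// high bits of a long yields distinct ids per attempt.
public final class BlockIdSketch {

    // Illustrative bit layout (not the real Uniffle format): the stage
    // attempt number occupies the high 32 bits, the taskId the low 32 bits.
    static long uniqueTaskId(int taskId, int stageAttemptNumber) {
        return ((long) stageAttemptNumber << 32) | (taskId & 0xFFFFFFFFL);
    }

    public static void main(String[] args) {
        long a = uniqueTaskId(1, 0); // task 1, stage attempt 0
        long b = uniqueTaskId(1, 1); // task 1, stage attempt 1 (retry)
        System.out.println(a != b);  // the two attempts no longer collide
    }
}
```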

> > > > It's dangerous to delete the failed data of the stage when we retry. It's hard to reach the condition to delete the data. > > >...

> @EnricoMi If the stage is retried, the taskId may not be unique, because we don't have a stage attemptId to distinguish task 1 attempt 0 in the stage attempt...

> Could you help review this? @EnricoMi @jerqi spark2 change will be finished after this PR is OK for you

Several questions:
1. How to reject the legacy requests?
2. ...

> > > Spark client can easily come up with a per-stage-attempt shuffle id and feed that to the shuffle server. That should not require any server-side refactoring.
> >
> > ...

> > Spark may compute partial tasks in a new attempt.
>
> You are saying a stage can be computed partially, let's say the first task and (if the...

> > If we make the unique shuffleIdWithAttemptNo generated or converted in server side
>
> I presume the server side does not know about the stage attempt number, so...
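Since the server side does not know the stage attempt number, one way the client could derive `shuffleIdWithAttemptNo` before any RPC is sketched below. This is an assumption, not the PR's actual implementation: the class name, the reservation scheme, and the `MAX_STAGE_ATTEMPTS` constant are all illustrative (4 matches Spark's default `spark.stage.maxConsecutiveAttempts`, but the real code may differ).

```java
// Hypothetical client-side sketch: the Spark client maps each logical
// shuffle id plus its stage attempt number onto a distinct derived id,
// so the shuffle server stores data per attempt without any server-side
// changes or knowledge of stage attempts.
public final class ShuffleIdSketch {

    // Reserve a fixed number of attempt slots per logical shuffle so derived
    // ids of different shuffles can never overlap (illustrative value).
    static final int MAX_STAGE_ATTEMPTS = 4;

    static int shuffleIdWithAttemptNo(int shuffleId, int stageAttemptNumber) {
        if (stageAttemptNumber < 0 || stageAttemptNumber >= MAX_STAGE_ATTEMPTS) {
            throw new IllegalArgumentException(
                "stage attempt " + stageAttemptNumber + " exceeds reserved slots");
        }
        return shuffleId * MAX_STAGE_ATTEMPTS + stageAttemptNumber;
    }

    public static void main(String[] args) {
        // Shuffle 1, attempts 0 and 1 map to distinct derived ids,
        // and neither overlaps with any attempt of shuffle 0.
        System.out.println(shuffleIdWithAttemptNo(1, 0));
        System.out.println(shuffleIdWithAttemptNo(1, 1));
    }
}
```

Because the derivation is purely client-side arithmetic, a stage retry simply registers and writes under a fresh derived shuffle id, and the stale data of the failed attempt can be cleaned up (or ignored) independently.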

> > > > > If we make the unique shuffleIdWithAttemptNo generated or converted in server side
>
> I...