incubator-uniffle

[Bug] Incorrect stage retry condition for fetch failure

Open zuston opened this issue 1 year ago • 1 comment

Code of Conduct

Search before asking

  • [X] I have searched in the issues and found no similar issues.

Describe the bug

In the current codebase, a stage retry is triggered when the number of fetch failures recorded for a partitionId reaches spark.task.maxFailures. Without AQE, this condition is correct. With AQE enabled, the assumption behind it is wrong.
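
A minimal sketch of how the condition might misfire, not taken from the Uniffle code. It assumes that under AQE several tasks can read from the same partitionId (for example when a skewed partition is split), so failures from different tasks are summed under one key. The class names, the `task_index` parameter, and the idea of keying by `(partition_id, task_index)` are all hypothetical and only illustrate the difference between the two counting strategies.

```python
from collections import Counter

# Stands in for spark.task.maxFailures (Spark's default value is 4).
MAX_FAILURES = 4


class PartitionKeyedTracker:
    """Hypothetical tracker: counts fetch failures per partition id only."""

    def __init__(self, max_failures=MAX_FAILURES):
        self.max_failures = max_failures
        self.failures = Counter()

    def record(self, partition_id, task_index):
        # task_index is ignored, so failures from different tasks that read
        # the same partition are added together.
        self.failures[partition_id] += 1
        return self.failures[partition_id] >= self.max_failures


class TaskKeyedTracker:
    """Hypothetical alternative: counts failures per (partition, task) pair."""

    def __init__(self, max_failures=MAX_FAILURES):
        self.max_failures = max_failures
        self.failures = Counter()

    def record(self, partition_id, task_index):
        key = (partition_id, task_index)
        self.failures[key] += 1
        return self.failures[key] >= self.max_failures


def simulate(tracker, events):
    """Return True if any recorded failure would trigger a stage retry."""
    # Build the list first so every event is recorded before checking.
    return any([tracker.record(p, t) for p, t in events])


# Assumed AQE scenario: partition 7 is read by four different tasks,
# and each task hits one fetch failure.
aqe_events = [(7, task) for task in range(4)]

# Non-AQE scenario: one task reads partition 7 and fails four times.
plain_events = [(7, 0)] * 4

print(simulate(PartitionKeyedTracker(), aqe_events))    # True
print(simulate(TaskKeyedTracker(), aqe_events))         # False
print(simulate(PartitionKeyedTracker(), plain_events))  # True
print(simulate(TaskKeyedTracker(), plain_events))       # True
```

In this sketch both trackers agree when each partition is read by a single task, which matches the statement that the condition is right without AQE. They diverge only when several tasks share a partitionId, where the partition-keyed count reaches the limit even though no individual task has failed that many times.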

Affects Version(s)

master

Uniffle Server Log Output

No response

Uniffle Engine Log Output

No response

Uniffle Server Configurations

No response

Uniffle Engine Configurations

No response

Additional context

No response

Are you willing to submit PR?

  • [ ] Yes I am willing to submit a PR!

zuston · Jun 17 '24 07:06

Could you help check this?

zuston · Jun 17 '24 07:06