celeborn icon indicating copy to clipboard operation
celeborn copied to clipboard

[CELEBORN-1768][WRITER] Refactoring Shuffle Writer to extract common methods

Open zwangsheng opened this issue 1 year ago • 11 comments

What changes were proposed in this pull request?

Refactoring shuffleWriters, extract common methods as BasedShuffleWriter.

Why are the changes needed?

Currently, HashBasedShuffleWriter and SortBasedShuffleWriter have a lot of the same methods, which makes it difficult for us to maintain the code. In many scenarios, we need to update multiple classes at the same time.

Therefore, we need to refactor Shuffle Writer, extract common methods without changing the specific method logic, and simplify the corresponding Shuffle Writer.

Does this PR introduce any user-facing change?

No, we do not change the logic code of apis or others.

How was this patch tested?

UT and local env

zwangsheng avatar Dec 09 '24 12:12 zwangsheng

Cool. Do test this PR on the cluster, if something is missing, it will cause data correctness problems which are important.

FMX avatar Dec 09 '24 13:12 FMX

Cool. Do test this PR on the cluster, if something is missing, it will cause data correctness problems which are important.

Well, i'll test normal and abnormal cases in yarn cluster laster.

zwangsheng avatar Dec 09 '24 13:12 zwangsheng

@zwangsheng Hi, How are your tests?

FMX avatar Dec 16 '24 06:12 FMX

@zwangsheng Hi, How are your tests?

Tested on yarn cluster, passing normal Terasort tests, as well as abnormal random task failure tests.

But for the sake of correctness, I still need time to continue testing.

zwangsheng avatar Dec 16 '24 12:12 zwangsheng

Test the normal case(with terasort test) and the abnormal case(RDD with task fail), seems this change is safe.

@FMX plz take a review

zwangsheng avatar Dec 26 '24 05:12 zwangsheng

Test the normal case(with terasort test) and the abnormal case(RDD with task fail), seems this change is safe.

@FMX plz take a review

OK, please solve the conflicts. The review will be done soon.

FMX avatar Jan 03 '25 07:01 FMX

This PR is stale because it has been open 20 days with no activity. Remove stale label or comment or this will be closed in 10 days.

github-actions[bot] avatar Jan 27 '25 08:01 github-actions[bot]

This issue was closed because it has been staled for 10 days with no activity.

github-actions[bot] avatar Feb 07 '25 08:02 github-actions[bot]

This PR is stale because it has been open 20 days with no activity. Remove stale label or comment or this will be closed in 10 days.

github-actions[bot] avatar Feb 28 '25 08:02 github-actions[bot]

This PR is stale because it has been open 20 days with no activity. Remove stale label or comment or this will be closed in 10 days.

github-actions[bot] avatar Mar 21 '25 08:03 github-actions[bot]

This PR is stale because it has been open 20 days with no activity. Remove stale label or comment or this will be closed in 10 days.

github-actions[bot] avatar Apr 11 '25 08:04 github-actions[bot]

This PR is stale because it has been open 20 days with no activity. Remove stale label or comment or this will be closed in 10 days.

github-actions[bot] avatar May 02 '25 08:05 github-actions[bot]

This issue was closed because it has been staled for 10 days with no activity.

github-actions[bot] avatar May 12 '25 08:05 github-actions[bot]

cc @zwangsheng any update?

turboFei avatar May 18 '25 14:05 turboFei

cc @zwangsheng any update?

Wrong push branch, open a new PR #3306

zwangsheng avatar Jun 05 '25 07:06 zwangsheng