spark [SPARK-48751][INFRA][PYTHON][TESTS] Re-balance `pyspark-pandas-connect` tests on GA

Open panbingkun opened this issue 7 months ago • 3 comments

What changes were proposed in this pull request?

This PR re-balances the `pyspark-pandas-connect` tests on GitHub Actions (GA).

Why are the changes needed?

To even out the execution time of the `pyspark-pandas-connect-part[0-3]` test jobs, so that no single job becomes a long tail that drives up the overall GA execution time.

Here are some currently observed examples:

https://github.com/apache/spark/pull/47135/checks?check_run_id=26784966983
Most of them take around 1 hour, but part2 took 1h 49m and part3 took 2h 16m.

https://github.com/panbingkun/spark/actions/runs/9693237300
Most of them take around 1 hour, but part2 took 1h 47m and part3 took 2h 20m.
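To make the long-tail cost concrete, here is a rough sketch of the arithmetic using the first run above. It assumes that part0 and part1 each took exactly 60 minutes (the description only says "around 1 hour") and that the four jobs run in parallel, so the slowest job sets the wall-clock time. The function names are illustrative only and are not part of the Spark build scripts.

```python
# Durations in minutes. part2 and part3 come from the first run quoted above;
# part0 and part1 are ASSUMED to be 60 ("around 1 hour" in the description).
observed = {"part0": 60, "part1": 60, "part2": 109, "part3": 136}


def wall_clock(parts):
    """Jobs run in parallel, so the slowest part sets the overall time."""
    return max(parts.values())


def ideal_balanced(parts):
    """Per-part time if the same total work were split perfectly evenly."""
    return sum(parts.values()) / len(parts)


print(f"slowest part:      {wall_clock(observed)} min")
print(f"perfectly balanced: {ideal_balanced(observed):.2f} min")
```

Under these assumptions the slowest part takes 136 minutes, while a perfectly even split of the same work would take about 91 minutes per part. A real re-balance will land somewhere in between, because tests can only be moved between parts in whole modules.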

Does this PR introduce any user-facing change?

No.

How was this patch tested?

By manually observing the execution time of `pyspark-pandas-connect-part[0-3]`.

Was this patch authored or co-authored using generative AI tooling?

No.

Jun 28 '24 02:06 panbingkun
