[SPARK-48751][INFRA][PYTHON][TESTS] Re-balance `pyspark-pandas-connect` tests on GA
What changes were proposed in this pull request?
This PR aims to re-balance the `pyspark-pandas-connect` tests on GA.
Why are the changes needed?
Bring the execution times of the `pyspark-pandas-connect-part[0-3]` test jobs to a roughly even level, avoiding long-tail jobs that increase the overall GA execution time.
Here are some currently observed examples:
- https://github.com/apache/spark/pull/47135/checks?check_run_id=26784966983
  Most of them take around `1 hour`, but `part2` took `1h 49m` and `part3` took `2h 16m`.
- https://github.com/panbingkun/spark/actions/runs/9693237300
  Most of them take around `1 hour`, but `part2` took `1h 47m` and `part3` took `2h 20m`.
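As a rough illustration of why the long tail matters, the sketch below compares the slowest part with a perfectly even split, using the figures from the first run listed above. The `part2` and `part3` values are the observed ones; the 60-minute values for `part0` and `part1` are an assumption based on "around 1 hour", and `summarize` is a hypothetical helper, not something in the Spark repository.

```python
# Durations in minutes for pyspark-pandas-connect-part[0-3].
# part2 (1h 49m) and part3 (2h 16m) are the observed values from the first run;
# part0 and part1 are ASSUMED to be 60 minutes ("around 1 hour"), not measured.
durations = {"part0": 60, "part1": 60, "part2": 109, "part3": 136}


def summarize(durations):
    """Return (slowest part, perfectly even share per part), both in minutes."""
    longest = max(durations.values())
    even_share = sum(durations.values()) / len(durations)
    return longest, even_share


longest, even_share = summarize(durations)
print(f"slowest part: {longest} min, even split: {even_share:.2f} min")
```

Because the parts run in parallel, the group finishes only when the slowest part does, so moving tests from the slow parts to the fast ones shortens the overall run even though the total work is unchanged.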
Does this PR introduce any user-facing change?
No.
How was this patch tested?
Manually observing the execution time of `pyspark-pandas-connect-part[0-3]`.
Was this patch authored or co-authored using generative AI tooling?
No.