
[Infra] [WIP] Add test report listener to delta-spark and bucket test suites by estimated runtime

Open scottsand-db opened this issue 1 year ago • 2 comments

Which Delta project/connector is this regarding?

  • [ ] Spark
  • [ ] Standalone
  • [ ] Flink
  • [ ] Kernel
  • [X] Other (Infra)

Description

Adds a test report listener to delta-spark that records test runtimes. The listener logs runtime metrics, writes them to a CSV file (per CI shard, per attempt, per JVM, per thread), and uploads that file as a test artifact.

We can use the output of this job to learn how to better bucket our slowest and largest tests.
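As a rough illustration of what bucketing by estimated runtime could look like, here is a minimal Python sketch of a greedy longest-first assignment: suites are sorted by runtime and each one goes into the bucket with the smallest running total. This is not the code in this PR (delta-spark's build is Scala/sbt); the function name, suite names, and runtimes are all hypothetical.

```python
import heapq


def bucket_by_runtime(runtimes, num_buckets):
    """Assign suites to buckets, longest first, always into the lightest bucket."""
    # Heap entries are (total_seconds, bucket_index); the index breaks ties.
    heap = [(0.0, i) for i in range(num_buckets)]
    heapq.heapify(heap)
    buckets = [[] for _ in range(num_buckets)]
    # Sort by runtime descending, then by name, so the result is deterministic.
    for suite, seconds in sorted(runtimes.items(), key=lambda kv: (-kv[1], kv[0])):
        total, idx = heapq.heappop(heap)
        buckets[idx].append(suite)
        heapq.heappush(heap, (total + seconds, idx))
    totals = [0.0] * num_buckets
    for total, idx in heap:
        totals[idx] = total
    return buckets, totals


# Hypothetical suite names and runtimes in seconds (not real Delta measurements).
example = {
    "SuiteA": 600.0,
    "SuiteB": 300.0,
    "SuiteC": 200.0,
    "SuiteD": 100.0,
    "SuiteE": 100.0,
}
buckets, totals = bucket_by_runtime(example, 2)
print(buckets)  # [['SuiteA', 'SuiteE'], ['SuiteB', 'SuiteC', 'SuiteD']]
print(totals)   # [700.0, 600.0]
```

Unlike hashing on the suite name, this uses the measured runtimes directly, so it depends on having reasonably fresh runtime data such as the CSV artifact described above.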

https://github.com/delta-io/delta/actions/runs/11002528077?pr=3694

image

How was this patch tested?

Locally and GitHub CI.

Does this PR introduce any user-facing changes?

No.

scottsand-db, Sep 19 '24 20:09

Another observation: I'm not sure our hashing distributes tests evenly enough.

The max difference between any two threads in a given shard is 68% (i.e. one thread has 68% more tests assigned to it than the other). This is in Shard 2, where test group 0 has 32 tests and test group 1 has 19.

DSL: Scala 2.12.18, Shard 0

[info] Test group 0 contains 40 tests
[info] Test group 1 contains 34 tests
[info] Test group 2 contains 33 tests
[info] Test group 3 contains 36 tests

DSL: Scala 2.12.18, Shard 1

[info] Test group 0 contains 24 tests
[info] Test group 1 contains 37 tests
[info] Test group 2 contains 35 tests
[info] Test group 3 contains 28 tests

DSL: Scala 2.12.18, Shard 2

[info] Test group 0 contains 32 tests
[info] Test group 1 contains 19 tests
[info] Test group 2 contains 29 tests
[info] Test group 3 contains 31 tests
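For reference, the imbalance figure can be reproduced from the group counts in the log lines above. The short Python sketch below only does that arithmetic; the helper name is made up for illustration.

```python
# Test counts per group, copied from the [info] log lines for each shard.
shards = {
    0: [40, 34, 33, 36],
    1: [24, 37, 35, 28],
    2: [32, 19, 29, 31],
}


def max_imbalance_pct(counts):
    """Percent by which the largest group exceeds the smallest, rounded."""
    return round((max(counts) / min(counts) - 1) * 100)


imbalance = {shard: max_imbalance_pct(counts) for shard, counts in shards.items()}
totals = {shard: sum(counts) for shard, counts in shards.items()}
print(imbalance)  # {0: 21, 1: 54, 2: 68}
print(totals)     # {0: 143, 1: 124, 2: 111}
```

Note that these are test counts, not runtimes, so they only show how evenly the hashing spreads the number of tests.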

scottsand-db, Sep 23 '24 17:09

image

There is a 24-minute disparity between the first and the last.

scottsand-db, Sep 23 '24 23:09

Hi @scottsand-db, do you think we can change this to print the output when the tests finish? That way we can merge the PR, monitor the results, and proactively take action.

Also, I can't find the total number of tests anywhere. Am I missing it? I recently found that we are running some tests in duplicate, which is not a big deal in itself, but if we don't track the count we could also miss tests without noticing.
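To make the request concrete, something along these lines could compute the total and flag duplicates from the runtime CSV. This is only a Python sketch: the column names (`suite`, `test`, `duration_ms`) and the sample rows are assumptions, since the actual CSV schema is not shown in this thread.

```python
import csv
import io
from collections import Counter


def summarize(csv_text):
    """Return (number of distinct tests, tests recorded more than once)."""
    rows = csv.DictReader(io.StringIO(csv_text))
    # Assumed columns: a test is identified by its suite name plus test name.
    counts = Counter((row["suite"], row["test"]) for row in rows)
    duplicates = sorted(key for key, n in counts.items() if n > 1)
    return len(counts), duplicates


# Hypothetical sample data, not taken from a real CI run.
sample = """suite,test,duration_ms
SuiteA,reads table,120
SuiteA,writes table,340
SuiteB,merges,900
SuiteA,reads table,130
"""
distinct, duplicates = summarize(sample)
print(distinct)    # 3
print(duplicates)  # [('SuiteA', 'reads table')]
```

Comparing the distinct count across runs would also show when tests silently drop out.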

Thanks.

felipepessoto, Nov 05 '25 07:11