gotestsum tolerate `run` without `pass/failed` for benchmarks

Something in Go 1.20 changed so that there is a run event for benchmarks. As before, there is no pass or failed. This triggers the "test is running and thus must have failed" logic in gotestsum, which causes it to report the benchmark as failed.

As this is the behavior of Go (whether it's correct or not...), gotestsum should accept such output. To reduce the risk that genuine problems go undetected, such incomplete benchmarks get reported when a panic was detected (the original motivation for this workaround).

Related-to: https://github.com/gotestyourself/gotestsum/issues/413#issuecomment-2343206787

Sep 11 '24 12:09 pohly

Gentle ping... this is kind of urgent because of https://github.com/kubernetes/kubernetes/issues/127245.

Sep 16 '24 19:09 pohly

Thank you for the PR! And sorry for the delay. I have been thinking about this change, and I don't think it's quite right.

The problem is that benchmarks don't send run events, but this fix is ignoring failures, which seems like the wrong fix for the problem.

I think https://github.com/gotestyourself/gotestsum/pull/416 is the right fix for this. The problem is that we're adding an empty event into the running map, which then fails later on in this code you are changing. Instead of ignoring the failure, I think we need to fix the missing event.

Sep 16 '24 21:09 dnephin

I added a comment to that issue with some other ideas about a fix. I like your suggestion of looking at the exit status to figure out if we might have failing tests with missing end events, but I'm not sure yet exactly what that could should look like.

As a workaround, is it possible to run the benchmarks without gotestsum, or do they get run together with tests?

Sep 16 '24 21:09 dnephin

I think https://github.com/gotestyourself/gotestsum/pull/416 is the right fix for this.

I got odd results when I tried it - see https://github.com/gotestyourself/gotestsum/pull/416#issuecomment-2355159877.

As a workaround, is it possible to run the benchmarks without gotestsum, or do they get run together with tests?

The shell scripts which run the benchmarks don't know that they are running benchmarks :cry: It all depends on what's in the package and which parameters are passed through.

The workaround while we try to figure out a solution would be to revert to what Kubernetes was doing previously: run go test -json, capture output, then summarize with gobenchstat. It would still produce erroneous "Failed" output in the summary, but at least the jobs would be considered as passing again because the non-zero exit code of gobenchstat would get ignored.

Sep 17 '24 10:09 pohly

@dnephin: ping?

Another alternative solution would be a command line option like --pass-through-exit-code: if true (not the default), the result of the go test invocation is used as the exit code of gotestsum. That doesn't address the confusion when the output of gotestsum reports FAIL, but at least CI success/failure results are the same with and without gotestsum.

This might help with flakes, too. As I learned, go test -json doesn't really cause the test binary to write JSON. There's still parsing of text output involved. If the test binary writes additional text randomly (logging...), then there is a risk that parsing randomly fails. I don't know how likely that is, but I suppose it cannot be ruled out.

Oct 02 '24 06:10 pohly

Another alternative solution would be a command line option like --pass-through-exit-code:

That won't actually help. In contrast to what I expected, the exit code is zero already. It's the <failure message="Failed;" type=""> in the JUnit file which causes our CI to treat the job as failed, so we really have to focus on suppressing those.

Nov 08 '24 13:11 pohly