
`--rerun-fails` always makes goleak and datarace errors pass after re-running

Open odubajDT opened this issue 1 year ago • 10 comments

Description

When using the --rerun-fails=1 option with gotestsum, all of the tests pass, and this behavior is fully deterministic. When I remove the option, I get a lot of goleak and data-race errors. It is surprising that the result with re-runs is so deterministic: with a large number of tests and multiple runs, I would expect at least one failure in the re-run as well. It seems to me that in the second run gotestsum simply ignores the errors that appeared during the first run.

odubajDT avatar Sep 23 '24 07:09 odubajDT

Hello, thank you for the bug report! Can you try with --debug? That will print the exact go test command that is run.

The re-runs are done one test at a time because of some limitations with the go test -run flag. I believe those limitations have since been fixed, so it may be possible to switch to running one package at a time instead.

The tests may be passing because the data-race is due to a goroutine that is leaked from a previous test. When only a single test is run the data-race is gone. I'm less sure about the goleak passing. It's also possible the rerun is not setting the correct flags somehow, but we should see that with --debug.

dnephin avatar Sep 25 '24 03:09 dnephin

Thanks for the response!

First variant, with the --rerun-fails=1 --debug options:

gotestsum --rerun-fails=1 --debug --packages="./..." -- -race -timeout 600s -parallel 4 --tags=""
exec: [go test -json -race -timeout 600s -parallel 4 --tags= ./...]

Re-run of the same test:

exec: [go test -json -test.run=^$ -race -timeout 600s -parallel 4 --tags= github.com/open-telemetry/opentelemetry-collector-contrib/receiver/awscontainerinsightreceiver/internal/cadvisor]

Full log:

/home/runner/work/opentelemetry-collector-contrib/opentelemetry-collector-contrib/.tools/gotestsum --rerun-fails=1 --debug --packages="./..." -- -race -timeout 600s -parallel 4 --tags=""
exec: [go test -json -race -timeout 600s -parallel 4 --tags= ./...]
go test pid: 8235
go: downloading gopkg.in/evanphx/json-patch.v4 v4.12.0
✓  . (1.071s)
∅  internal/cadvisor/testutils
✖  internal/cadvisor (494ms)
✓  internal/cadvisor/extractors (1.108s)
✓  internal/ecsInfo (1.033s)
✓  internal/host (1.039s)
∅  internal/metadata
∅  internal/stores/kubeletutil
✓  internal/k8sapiserver (1.06s)
✓  internal/stores (1.072s)

DONE 98 tests, 1 failure in 74.185s

exec: [go test -json -test.run=^$ -race -timeout 600s -parallel 4 --tags= github.com/open-telemetry/opentelemetry-collector-contrib/receiver/awscontainerinsightreceiver/internal/cadvisor]
go test pid: 15675
✓  internal/cadvisor (1.018s)

odubajDT avatar Sep 25 '24 06:09 odubajDT

On the other hand, here is the same test run without the --rerun-fails option:

/home/runner/work/opentelemetry-collector-contrib/opentelemetry-collector-contrib/.tools/gotestsum --debug --packages="./..." -- -race -timeout 600s -parallel 4 --tags=""
exec: [go test -json -race -timeout 600s -parallel 4 --tags= ./...]
go test pid: 8394
go: downloading gopkg.in/evanphx/json-patch.v4 v4.12.0
∅  internal/cadvisor/testutils
✖  internal/cadvisor (491ms)
✓  . (1.091s)
✓  internal/cadvisor/extractors (1.084s)
✓  internal/ecsInfo (1.039s)
✓  internal/host (1.043s)
∅  internal/metadata
∅  internal/stores/kubeletutil
✓  internal/k8sapiserver (1.058s)
✓  internal/stores (1.066s)

=== Failed
=== FAIL: internal/cadvisor  (0.00s)
PASS
goleak: Errors on successful test run: found unexpected goroutines:
[Goroutine 6 in state select, with github.com/open-telemetry/opentelemetry-collector-contrib/internal/aws/metrics.(*MapWithExpiry).sweep on top of the stack:
github.com/open-telemetry/opentelemetry-collector-contrib/internal/aws/metrics.(*MapWithExpiry).sweep(0xc000059b20, 0xc00033ac20)
	/home/runner/work/opentelemetry-collector-contrib/opentelemetry-collector-contrib/internal/aws/metrics/metric_calculator.go:142 +0x1b0
created by github.com/open-telemetry/opentelemetry-collector-contrib/internal/aws/metrics.NewMapWithExpiry in goroutine 5
make[2]: *** [../../Makefile.Common:130: test] Error 1
make[1]: *** [Makefile:187: receiver/awscontainerinsightreceiver] Error 2
make: *** [Makefile:127: gotest] Error 2
	/home/runner/work/opentelemetry-collector-contrib/opentelemetry-collector-contrib/internal/aws/metrics/metric_calculator.go:126 +0x265
 Goroutine 7 in state select, with github.com/open-telemetry/opentelemetry-collector-contrib/internal/aws/metrics.(*MapWithExpiry).sweep on top of the stack:
github.com/open-telemetry/opentelemetry-collector-contrib/internal/aws/metrics.(*MapWithExpiry).sweep(0xc000059b60, 0xc00033ac40)
	/home/runner/work/opentelemetry-collector-contrib/opentelemetry-collector-contrib/internal/aws/metrics/metric_calculator.go:142 +0x1b0
created by github.com/open-telemetry/opentelemetry-collector-contrib/internal/aws/metrics.NewMapWithExpiry in goroutine 5
	/home/runner/work/opentelemetry-collector-contrib/opentelemetry-collector-contrib/internal/aws/metrics/metric_calculator.go:126 +0x265
 Goroutine 8 in state select, with github.com/open-telemetry/opentelemetry-collector-contrib/internal/aws/metrics.(*MapWithExpiry).sweep on top of the stack:
github.com/open-telemetry/opentelemetry-collector-contrib/internal/aws/metrics.(*MapWithExpiry).sweep(0xc000059bc0, 0xc00033ac50)
	/home/runner/work/opentelemetry-collector-contrib/opentelemetry-collector-contrib/internal/aws/metrics/metric_calculator.go:142 +0x1b0
created by github.com/open-telemetry/opentelemetry-collector-contrib/internal/aws/metrics.NewMapWithExpiry in goroutine 5
	/home/runner/work/opentelemetry-collector-contrib/opentelemetry-collector-contrib/internal/aws/metrics/metric_calculator.go:126 +0x265
 Goroutine 9 in state select, with github.com/open-telemetry/opentelemetry-collector-contrib/internal/aws/metrics.(*MapWithExpiry).sweep on top of the stack:
github.com/open-telemetry/opentelemetry-collector-contrib/internal/aws/metrics.(*MapWithExpiry).sweep(0xc000059c00, 0xc00033ac60)
	/home/runner/work/opentelemetry-collector-contrib/opentelemetry-collector-contrib/internal/aws/metrics/metric_calculator.go:142 +0x1b0
created by github.com/open-telemetry/opentelemetry-collector-contrib/internal/aws/metrics.NewMapWithExpiry in goroutine 5
	/home/runner/work/opentelemetry-collector-contrib/opentelemetry-collector-contrib/internal/aws/metrics/metric_calculator.go:126 +0x265
 Goroutine 36 in state select, with github.com/open-telemetry/opentelemetry-collector-contrib/internal/aws/metrics.(*MapWithExpiry).sweep on top of the stack:
github.com/open-telemetry/opentelemetry-collector-contrib/internal/aws/metrics.(*MapWithExpiry).sweep(0xc0001a4240, 0xc00018d700)
	/home/runner/work/opentelemetry-collector-contrib/opentelemetry-collector-contrib/internal/aws/metrics/metric_calculator.go:142 +0x1b0
created by github.com/open-telemetry/opentelemetry-collector-contrib/internal/aws/metrics.NewMapWithExpiry in goroutine 35
	/home/runner/work/opentelemetry-collector-contrib/opentelemetry-collector-contrib/internal/aws/metrics/metric_calculator.go:126 +0x265
 Goroutine 37 in state select, with github.com/open-telemetry/opentelemetry-collector-contrib/internal/aws/metrics.(*MapWithExpiry).sweep on top of the stack:
github.com/open-telemetry/opentelemetry-collector-contrib/internal/aws/metrics.(*MapWithExpiry).sweep(0xc0001a4280, 0xc00018d720)
	/home/runner/work/opentelemetry-collector-contrib/opentelemetry-collector-contrib/internal/aws/metrics/metric_calculator.go:142 +0x1b0
created by github.com/open-telemetry/opentelemetry-collector-contrib/internal/aws/metrics.NewMapWithExpiry in goroutine 35
	/home/runner/work/opentelemetry-collector-contrib/opentelemetry-collector-contrib/internal/aws/metrics/metric_calculator.go:126 +0x265
 Goroutine 38 in state select, with github.com/open-telemetry/opentelemetry-collector-contrib/internal/aws/metrics.(*MapWithExpiry).sweep on top of the stack:
github.com/open-telemetry/opentelemetry-collector-contrib/internal/aws/metrics.(*MapWithExpiry).sweep(0xc0001a42e0, 0xc00018d730)
	/home/runner/work/opentelemetry-collector-contrib/opentelemetry-collector-contrib/internal/aws/metrics/metric_calculator.go:142 +0x1b0
created by github.com/open-telemetry/opentelemetry-collector-contrib/internal/aws/metrics.NewMapWithExpiry in goroutine 35
	/home/runner/work/opentelemetry-collector-contrib/opentelemetry-collector-contrib/internal/aws/metrics/metric_calculator.go:126 +0x265
 Goroutine 39 in state select, with github.com/open-telemetry/opentelemetry-collector-contrib/internal/aws/metrics.(*MapWithExpiry).sweep on top of the stack:
github.com/open-telemetry/opentelemetry-collector-contrib/internal/aws/metrics.(*MapWithExpiry).sweep(0xc0001a4320, 0xc00018d740)
	/home/runner/work/opentelemetry-collector-contrib/opentelemetry-collector-contrib/internal/aws/metrics/metric_calculator.go:142 +0x1b0
created by github.com/open-telemetry/opentelemetry-collector-contrib/internal/aws/metrics.NewMapWithExpiry in goroutine 35
	/home/runner/work/opentelemetry-collector-contrib/opentelemetry-collector-contrib/internal/aws/metrics/metric_calculator.go:126 +0x265
 Goroutine 44 in state select, with github.com/open-telemetry/opentelemetry-collector-contrib/internal/aws/metrics.(*MapWithExpiry).sweep on top of the stack:
github.com/open-telemetry/opentelemetry-collector-contrib/internal/aws/metrics.(*MapWithExpiry).sweep(0xc0001a43a0, 0xc00018daf0)
	/home/runner/work/opentelemetry-collector-contrib/opentelemetry-collector-contrib/internal/aws/metrics/metric_calculator.go:142 +0x1b0
created by github.com/open-telemetry/opentelemetry-collector-contrib/internal/aws/metrics.NewMapWithExpiry in goroutine 43
	/home/runner/work/opentelemetry-collector-contrib/opentelemetry-collector-contrib/internal/aws/metrics/metric_calculator.go:126 +0x265
 Goroutine 45 in state select, with github.com/open-telemetry/opentelemetry-collector-contrib/internal/aws/metrics.(*MapWithExpiry).sweep on top of the stack:
github.com/open-telemetry/opentelemetry-collector-contrib/internal/aws/metrics.(*MapWithExpiry).sweep(0xc0001a43e0, 0xc00018db20)
	/home/runner/work/opentelemetry-collector-contrib/opentelemetry-collector-contrib/internal/aws/metrics/metric_calculator.go:142 +0x1b0
created by github.com/open-telemetry/opentelemetry-collector-contrib/internal/aws/metrics.NewMapWithExpiry in goroutine 43
	/home/runner/work/opentelemetry-collector-contrib/opentelemetry-collector-contrib/internal/aws/metrics/metric_calculator.go:126 +0x265
 Goroutine 46 in state select, with github.com/open-telemetry/opentelemetry-collector-contrib/internal/aws/metrics.(*MapWithExpiry).sweep on top of the stack:
github.com/open-telemetry/opentelemetry-collector-contrib/internal/aws/metrics.(*MapWithExpiry).sweep(0xc0001a4440, 0xc00018db40)
	/home/runner/work/opentelemetry-collector-contrib/opentelemetry-collector-contrib/internal/aws/metrics/metric_calculator.go:142 +0x1b0
created by github.com/open-telemetry/opentelemetry-collector-contrib/internal/aws/metrics.NewMapWithExpiry in goroutine 43
	/home/runner/work/opentelemetry-collector-contrib/opentelemetry-collector-contrib/internal/aws/metrics/metric_calculator.go:126 +0x265
 Goroutine 47 in state select, with github.com/open-telemetry/opentelemetry-collector-contrib/internal/aws/metrics.(*MapWithExpiry).sweep on top of the stack:
github.com/open-telemetry/opentelemetry-collector-contrib/internal/aws/metrics.(*MapWithExpiry).sweep(0xc0001a4480, 0xc00018db60)
	/home/runner/work/opentelemetry-collector-contrib/opentelemetry-collector-contrib/internal/aws/metrics/metric_calculator.go:142 +0x1b0
created by github.com/open-telemetry/opentelemetry-collector-contrib/internal/aws/metrics.NewMapWithExpiry in goroutine 43
	/home/runner/work/opentelemetry-collector-contrib/opentelemetry-collector-contrib/internal/aws/metrics/metric_calculator.go:126 +0x265
]
FAIL	github.com/open-telemetry/opentelemetry-collector-contrib/receiver/awscontainerinsightreceiver/internal/cadvisor	0.491s

DONE 98 tests, 1 failure in 72.943s

odubajDT avatar Sep 25 '24 06:09 odubajDT

Full logs can be found as part of this PR: https://github.com/open-telemetry/opentelemetry-collector-contrib/pull/35413

I created it just to showcase the problem.

odubajDT avatar Sep 25 '24 06:09 odubajDT

Hello, any news about this?

odubajDT avatar Oct 14 '24 05:10 odubajDT

Hello, any news about this?

odubajDT avatar Nov 04 '24 13:11 odubajDT

Hello, any news about this?

odubajDT avatar Nov 15 '24 06:11 odubajDT

Hello, any news about this?

odubajDT avatar Jan 27 '25 08:01 odubajDT

I have not had much time to look into this problem.

What I do notice, looking at https://github.com/gotestyourself/gotestsum/issues/442#issuecomment-2373174041, is that there is only one exec line. So --rerun-fails isn't really doing anything here. It's failing on the first attempt.

dnephin avatar Feb 08 '25 21:02 dnephin

I guess the reason it passes on the second attempt is that the tests are run in isolation. That would mean the races and leaks are caused by multiple tests running together.

This is a limitation of --rerun-fails that I guess we can document: it does the re-runs one test at a time.

dnephin avatar Feb 08 '25 21:02 dnephin