go-internal icon indicating copy to clipboard operation
go-internal copied to clipboard

testscript: should GOTRACEBACK be explicitly set in the environment?

Open myitcv opened this issue 1 year ago • 5 comments

In https://github.com/rogpeppe/go-internal/pull/171, the following change was made to explicitly set GOTRACEBACK in the testscript environment:

https://github.com/rogpeppe/go-internal/blob/9957a5216c116f4787d3b1afd8072405744804ee/testscript/testscript.go#L394

This mirrors a similar setting in upstream:

https://github.com/golang/go/blob/4a7f3ac8eb4381ea62caa1741eeeec28363245b4/src/cmd/go/script_test.go#L227C4-L227C11

It's not clear to me whether we need/want such a setting because the explicit setting of GOTRACEBACK causes trace output to differ from the Go default, the latter most likely being what the script author would expect.

The setting of the value in upstream is explained in the commit message for https://github.com/golang/go/commit/d83baa1aa22d074b44d8b705e1d8dafa30ecceb1:

This change also runs scripted tests with GOTRACEBACK=system, upgrading
from GOTRACEBACK=all. Often, script tests can trigger failures deep in
the runtime in interesting ways because they start many individual Go
processes, so being able to identify points of interest in the runtime
is quite useful.

I don't think that logic applies by default for users of testscript outside of the Go project.

Hence my suggestion would be that we remove this explicit setting, to restore the default behaviour of cmd/go.

cc @mvdan @FiloSottile

myitcv avatar Feb 10 '24 14:02 myitcv

Agreed. We should not be triggering or investigating runtime bugs.

mvdan avatar Feb 10 '24 14:02 mvdan

The default (single) is good for panics, but pretty useless for SIGQUIT since the signal handler is not the reason the program timed out.

I think I picked system by mistake though, it should be all.

FiloSottile avatar Feb 10 '24 15:02 FiloSottile

Using GOTRACEBACK=all seems fine to me as long as we document it :)

mvdan avatar Feb 10 '24 15:02 mvdan

I'll very much defer to others' greater experience/understanding here :)

Happy to do the change if someone can provide me with some good commentary on why we are using all

myitcv avatar Feb 10 '24 16:02 myitcv

My understanding is that, when a testscript panics, with GOTRACEBACK=single (the default) we only print the stack for the running goroutine that panicked, whereas with GOTRACEBACK=all we print all non-runtime goroutine stacks. That seems useful, particularly when a testscript has background commands running or other "service" goroutines like net/http/httptest. It's also more noisy, but in theory, panics should be rare.

I'm not sure what Filippo means with SIGQUIT; my understanding is that a SIGQUIT makes the Go runtime always print all goroutine stacks before exiting, presumably regardless of the GOTRACEBACK setting.

mvdan avatar Feb 10 '24 16:02 mvdan