spiffe-helper icon indicating copy to clipboard operation
spiffe-helper copied to clipboard

Add a test suite for sidecar cmd and signals

Open ringerc opened this issue 11 months ago • 10 comments

Add test cover for:

  • long-running and short-lived cmd and cmd_args invocations
  • signalling via pid_file_name and renew_signal
  • windows cmd and cmd_args execution

If rebased onto the 'fix-windows' branch for windows test fixes, these pass for windows test runs too.

To make these tests possible the following changes are made to the sidecar code:

  • Sidecar.certReadyChan yields *workloadapi.X509Context not struct{} so the context can be observed. This could also be useful for code extending spiffe-helper.
  • A new Sidecar.cmdExitChan channel reports whenever a process managed by spiffe-helper exits, and the exit state. This allows tests to wait for process exits and see the exit code, etc.
  • A new Sidecar.pidFileSignalledChan channel reports whenever signalling the file in pid_file_name is attempted, the pid signalled, and the outcome. This allows signal based tests to see what spiffe-helper tried to signal.

Channels with values are used to ensure that there's no race between what might be present in the Sidecar struct state when the test observes it and what was there when the event of interest occurred.

It might be handy to have a channel for observing when the managed process is started too, but the tests just busy-wait polling Sidecar.processRunning to detect this for now.

I considered combining these channels into some kind of unified observer channel that emits a union struct depending on the event type, but it seemed pointless to do this when we can just select{} on multiple channels of interest and keep each channel simple.

Extracted from https://github.com/spiffe/spiffe-helper/pull/252, with linter complaints addressed, and the test for pid signalling adapted so it asserts that there are no retries.

ringerc avatar Feb 05 '25 00:02 ringerc

This requires a rebase now. It would've been great to merge these before making changes to the code these tests exercise, so the new changes could be covered by the tests I wrote.

ringerc avatar Feb 24 '25 04:02 ringerc

Fixed test compilation and execution. It seems to now run serialized for each sub-test though. Presumably an issue with the added mutex, or with my merge.

Will look into it. Some new linter complaints too, new linter config I think.

ringerc avatar Feb 24 '25 05:02 ringerc

Whew. That took a surprising amount of doing but I think the tests are better for it.

Note that these tests do not attempt to "shut down" the sidecar completely after each test. They just orphan it for the GC to deal with. I didn't want to introduce a bunch of extra logic that would only be required for testing a helper lifecycle that does not ever occur in real use.

ringerc avatar Feb 25 '25 00:02 ringerc

This requires a rebase now. It would've been great to merge these before making changes to the code these tests exercise, so the new changes could be covered by the tests I wrote.

Sorry for the delay in getting to this and for merging other things ahead. I missed this message when you first posted it. Will review this next.

faisal-memon avatar Feb 26 '25 23:02 faisal-memon

Thanks @faisal-memon

ringerc avatar Mar 02 '25 20:03 ringerc

After rebasing on top of the win32 fixes I see this failure when running this branch's make test-wine

➜  spiffe-helper git:(test-suite-for-cmd-and-signal-file) ✗ wineme make test-wine
GOOS=windows go test  -exec wine ./...

?   	github.com/spiffe/spiffe-helper/cmd/spiffe-helper	[no test files]
?   	github.com/spiffe/spiffe-helper/pkg/health	[no test files]
?   	github.com/spiffe/spiffe-helper/test/spiffetest	[no test files]
?   	github.com/spiffe/spiffe-helper/test/util	[no test files]
ok  	github.com/spiffe/spiffe-helper/cmd/spiffe-helper/config	2.225s
ok  	github.com/spiffe/spiffe-helper/pkg/disk	2.279s
hello
hello
hello
touch: cannot touch 'C:\users\craig\Temp\TestSignalProcess1559814469\001/1584742': No such file or directory
panic: runtime error: invalid memory address or nil pointer dereference
[signal 0xc0000005 code=0x0 addr=0x38 pc=0xbc4c41]

goroutine 12 [running]:
github.com/spiffe/spiffe-helper/pkg/sidecar.(*Sidecar).checkProcessExit(0xc000055980)
	/home/craig/upm/upm-stack/spiffe-helper/pkg/sidecar/sidecar.go:357 +0x101
created by github.com/spiffe/spiffe-helper/pkg/sidecar.(*Sidecar).signalProcess in goroutine 27
	/home/craig/upm/upm-stack/spiffe-helper/pkg/sidecar/sidecar.go:298 +0x26a
FAIL	github.com/spiffe/spiffe-helper/pkg/sidecar	1.200s
FAIL
make: *** [Makefile:222: test-wine] Error 1

Directory path-separator issue maybe. I'll look into it. It doesn't make sense to invoke touch on windows tests anyway, as when on real Windows not wine there won't be any such executable.

ringerc avatar Mar 02 '25 21:03 ringerc

I pushed a fix for the failure on rebase. The TestSignalProcess test didn't work on Windows/wine.

ringerc avatar Mar 02 '25 21:03 ringerc

@faisal-memon I've rebased this and fixed the DCO. Tests run on linux/amd64 and win32/amd64(wine).

➜  spiffe-helper git:(test-suite-for-cmd-and-signal-file) ✗ make test
go test  ./...
?   	github.com/spiffe/spiffe-helper/cmd/spiffe-helper	[no test files]
?   	github.com/spiffe/spiffe-helper/pkg/health	[no test files]
?   	github.com/spiffe/spiffe-helper/test/spiffetest	[no test files]
?   	github.com/spiffe/spiffe-helper/test/util	[no test files]
ok  	github.com/spiffe/spiffe-helper/cmd/spiffe-helper/config	(cached)
ok  	github.com/spiffe/spiffe-helper/pkg/disk	(cached)
wineok  	github.com/spiffe/spiffe-helper/pkg/sidecar	3.343s

➜  spiffe-helper git:(test-suite-for-cmd-and-signal-file) ✗ make test-wine
GOOS=windows go test  -exec wine ./...
?   	github.com/spiffe/spiffe-helper/cmd/spiffe-helper	[no test files]
?   	github.com/spiffe/spiffe-helper/pkg/health	[no test files]
?   	github.com/spiffe/spiffe-helper/test/spiffetest	[no test files]
?   	github.com/spiffe/spiffe-helper/test/util	[no test files]
ok  	github.com/spiffe/spiffe-helper/cmd/spiffe-helper/config	0.689s
ok  	github.com/spiffe/spiffe-helper/pkg/disk	2.120s
ok  	github.com/spiffe/spiffe-helper/pkg/sidecar	1.028s

ringerc avatar Mar 06 '25 20:03 ringerc

@ringerc Thanks for submitting these valuable tests and sorry for the delay in review. I added some comments. Please lmk when youve had a chance to look at them.

faisal-memon avatar Mar 13 '25 03:03 faisal-memon

@faisal-memon I think I've addressed your review comments. Thanks, that helped catch a mistake in my logic for test stdio handling.

ringerc avatar Mar 16 '25 22:03 ringerc

@ringerc I see this when running the tests. Is it expected?

=== RUN   TestSidecar_RunDaemon
=== RUN   TestSidecar_RunDaemon/svid_with_intermediate
    util_test.go:115: WARNING: sidecar process 34971 still running at end of test (&os.Process{Pid:34971, mode:0x0, state:atomic.Uint64{_:atomic.noCopy{}, _:atomic.align64{}, v:0x0}, sigMu:sync.RWMutex{w:sync.Mutex{_:sync.noCopy{}, mu:sync.Mutex{state:0, sema:0x0}}, writerSem:0x0, readerSem:0x0, readerCount:atomic.Int32{_:atomic.noCopy{}, v:0}, readerWait:atomic.Int32{_:atomic.noCopy{}, v:0}}, handle:0x0})
=== RUN   TestSidecar_RunDaemon/intermediate_in_bundle

    util_test.go:115: WARNING: sidecar process 34972 still running at end of test (&os.Process{Pid:34972, mode:0x0, state:atomic.Uint64{_:atomic.noCopy{}, _:atomic.align64{}, v:0x0}, sigMu:sync.RWMutex{w:sync.Mutex{_:sync.noCopy{}, mu:sync.Mutex{state:0, sema:0x0}}, writerSem:0x0, readerSem:0x0, readerCount:atomic.Int32{_:atomic.noCopy{}, v:0}, readerWait:atomic.Int32{_:atomic.noCopy{}, v:0}}, handle:0x0})
=== RUN   TestSidecar_RunDaemon/single_svid_with_intermediate_in_bundle

    util_test.go:115: WARNING: sidecar process 34973 still running at end of test (&os.Process{Pid:34973, mode:0x0, state:atomic.Uint64{_:atomic.noCopy{}, _:atomic.align64{}, v:0x0}, sigMu:sync.RWMutex{w:sync.Mutex{_:sync.noCopy{}, mu:sync.Mutex{state:0, sema:0x0}}, writerSem:0x0, readerSem:0x0, readerCount:atomic.Int32{_:atomic.noCopy{}, v:0}, readerWait:atomic.Int32{_:atomic.noCopy{}, v:0}}, handle:0x0})
=== RUN   TestSidecar_RunDaemon/single_svid
    util_test.go:115: WARNING: sidecar process 34974 still running at end of test (&os.Process{Pid:34974, mode:0x0, state:atomic.Uint64{_:atomic.noCopy{}, _:atomic.align64{}, v:0x0}, sigMu:sync.RWMutex{w:sync.Mutex{_:sync.noCopy{}, mu:sync.Mutex{state:0, sema:0x0}}, writerSem:0x0, readerSem:0x0, readerCount:atomic.Int32{_:atomic.noCopy{}, v:0}, readerWait:atomic.Int32{_:atomic.noCopy{}, v:0}}, handle:0x0})

=== RUN   TestSidecar_RunDaemon/single_svid_with_RenewSignal
    util_test.go:115: WARNING: sidecar process 34975 still running at end of test (&os.Process{Pid:34975, mode:0x0, state:atomic.Uint64{_:atomic.noCopy{}, _:atomic.align64{}, v:0x0}, sigMu:sync.RWMutex{w:sync.Mutex{_:sync.noCopy{}, mu:sync.Mutex{state:0, sema:0x0}}, writerSem:0x0, readerSem:0x0, readerCount:atomic.Int32{_:atomic.noCopy{}, v:0}, readerWait:atomic.Int32{_:atomic.noCopy{}, v:0}}, handle:0x0})

=== RUN   TestSidecar_RunDaemon/svid_with_federated_trust_domains
    util_test.go:115: WARNING: sidecar process 34976 still running at end of test (&os.Process{Pid:34976, mode:0x0, state:atomic.Uint64{_:atomic.noCopy{}, _:atomic.align64{}, v:0x0}, sigMu:sync.RWMutex{w:sync.Mutex{_:sync.noCopy{}, mu:sync.Mutex{state:0, sema:0x0}}, writerSem:0x0, readerSem:0x0, readerCount:atomic.Int32{_:atomic.noCopy{}, v:0}, readerWait:atomic.Int32{_:ato

faisal-memon avatar Apr 18 '25 22:04 faisal-memon

@ringerc I see this when running the tests. Is it expected?

=== RUN   TestSidecar_RunDaemon
=== RUN   TestSidecar_RunDaemon/svid_with_intermediate
    util_test.go:115: WARNING: sidecar process 34971 still running at end of test (&os.Process{Pid:34971, mode:0x0, state:atomic.Uint64{_:atomic.noCopy{}, _:atomic.align64{}, v:0x0}, sigMu:sync.RWMutex{w:sync.Mutex{_:sync.noCopy{}, mu:sync.Mutex{state:0, sema:0x0}}, writerSem:0x0, readerSem:0x0, readerCount:atomic.Int32{_:atomic.noCopy{}, v:0}, readerWait:atomic.Int32{_:atomic.noCopy{}, v:0}}, handle:0x0})
=== RUN   TestSidecar_RunDaemon/intermediate_in_bundle

    util_test.go:115: WARNING: sidecar process 34972 still running at end of test (&os.Process{Pid:34972, mode:0x0, state:atomic.Uint64{_:atomic.noCopy{}, _:atomic.align64{}, v:0x0}, sigMu:sync.RWMutex{w:sync.Mutex{_:sync.noCopy{}, mu:sync.Mutex{state:0, sema:0x0}}, writerSem:0x0, readerSem:0x0, readerCount:atomic.Int32{_:atomic.noCopy{}, v:0}, readerWait:atomic.Int32{_:atomic.noCopy{}, v:0}}, handle:0x0})
=== RUN   TestSidecar_RunDaemon/single_svid_with_intermediate_in_bundle

    util_test.go:115: WARNING: sidecar process 34973 still running at end of test (&os.Process{Pid:34973, mode:0x0, state:atomic.Uint64{_:atomic.noCopy{}, _:atomic.align64{}, v:0x0}, sigMu:sync.RWMutex{w:sync.Mutex{_:sync.noCopy{}, mu:sync.Mutex{state:0, sema:0x0}}, writerSem:0x0, readerSem:0x0, readerCount:atomic.Int32{_:atomic.noCopy{}, v:0}, readerWait:atomic.Int32{_:atomic.noCopy{}, v:0}}, handle:0x0})
=== RUN   TestSidecar_RunDaemon/single_svid
    util_test.go:115: WARNING: sidecar process 34974 still running at end of test (&os.Process{Pid:34974, mode:0x0, state:atomic.Uint64{_:atomic.noCopy{}, _:atomic.align64{}, v:0x0}, sigMu:sync.RWMutex{w:sync.Mutex{_:sync.noCopy{}, mu:sync.Mutex{state:0, sema:0x0}}, writerSem:0x0, readerSem:0x0, readerCount:atomic.Int32{_:atomic.noCopy{}, v:0}, readerWait:atomic.Int32{_:atomic.noCopy{}, v:0}}, handle:0x0})

=== RUN   TestSidecar_RunDaemon/single_svid_with_RenewSignal
    util_test.go:115: WARNING: sidecar process 34975 still running at end of test (&os.Process{Pid:34975, mode:0x0, state:atomic.Uint64{_:atomic.noCopy{}, _:atomic.align64{}, v:0x0}, sigMu:sync.RWMutex{w:sync.Mutex{_:sync.noCopy{}, mu:sync.Mutex{state:0, sema:0x0}}, writerSem:0x0, readerSem:0x0, readerCount:atomic.Int32{_:atomic.noCopy{}, v:0}, readerWait:atomic.Int32{_:atomic.noCopy{}, v:0}}, handle:0x0})

=== RUN   TestSidecar_RunDaemon/svid_with_federated_trust_domains
    util_test.go:115: WARNING: sidecar process 34976 still running at end of test (&os.Process{Pid:34976, mode:0x0, state:atomic.Uint64{_:atomic.noCopy{}, _:atomic.align64{}, v:0x0}, sigMu:sync.RWMutex{w:sync.Mutex{_:sync.noCopy{}, mu:sync.Mutex{state:0, sema:0x0}}, writerSem:0x0, readerSem:0x0, readerCount:atomic.Int32{_:atomic.noCopy{}, v:0}, readerWait:atomic.Int32{_:ato

No, nor do I observe it happening in my runs. Odd.

ringerc avatar Apr 21 '25 22:04 ringerc

No, nor do I observe it happening in my runs. Odd.

Ok fixed that issue, it was a race condition. We needed to retry checking the process exited.

faisal-memon avatar May 02 '25 04:05 faisal-memon