fix(test): Wait for Vault server in watch tests
Fix flaky tests by polling Vault's /v1/sys/health endpoint. This ensures the server is ready before tests attempt to connect, preventing a race condition during Vault dev server startup.
Example of a failed run: https://github.com/hashicorp/consul-template/actions/runs/14620430375/job/41018777729?pr=2052
Breakdown of the issue in watch/watch_test.go:
- main calls newTestVault() to start a Vault dev server using exec.Command("vault", "server", "-dev", ...) and stores the command in testVault.
- Immediately after cmd.Start(), main creates Vault clients and calls vaultTokenSetup(clients).
- vaultTokenSetup then attempts to communicate with the Vault API through vc.Sys().EnableAuthWithOptions(...), which fails with a connection refused error because the process is not listening yet.
Seems to affect the Enterprise tests more than the others.
You can simulate the test error locally by making the test launch an intermediate script which sleeps indefinitely.
I was able to dig out the following from Vault (Enterprise) stdout:
Error parsing listener configuration.
Error initializing listener of type tcp: listen tcp 127.0.0.1:8200: bind: address already in use
Looks like it needs to use ephemeral ports like the other tests. Will mark this as a draft for now.
The watch package tests now pass. However, all Consul Enterprise-related tests still fail.
failed to start consul server: api unavailable
FAIL github.com/hashicorp/consul-template/dependency 2.154s
There's very little to debug because the Consul output is omitted. My only hunch is that the Consul Enterprise license in the CONSUL_LICENSE env variable (passed from a GitHub secret) may have expired.
Maybe someone from HashiCorp could verify.