agones icon indicating copy to clipboard operation
agones copied to clipboard

Flaky: TestSuperTuxKartGameServerReady

Open markmandel opened this issue 2 months ago • 8 comments

What happened:

e2e test TestSuperTuxKartGameServerReady is flake.

What's really weird, is that when it fails in one test, it continues to fail on subsequent tests. This also goes across standard and autopilot, although I think it happens more on autopilot.

Example: https://console.cloud.google.com/cloud-build/builds/b71c42ca-43da-429b-83b8-7a497d518708;step=1?project=agones-images

Here are the logs and the test output.

VERBOSE: === RUN   TestSuperTuxKartGameServerReady
VERBOSE: === PAUSE TestSuperTuxKartGameServerReady
VERBOSE: === CONT  TestSuperTuxKartGameServerReady
VERBOSE: time="2025-10-31 20:43:08.178" level=info msg="GameServer created, waiting for Ready" gs=supertuxkart-4cchc test=TestSuperTuxKartGameServerReady
VERBOSE: time="2025-10-31 20:43:08.282" level=info msg="Waiting for states to match" awaitingState=Ready currentState=Starting gs=supertuxkart-4cchc test=TestSuperTuxKartGameServerReady
VERBOSE: time="2025-10-31 20:43:09.261" level=info msg="Waiting for states to match" awaitingState=Ready currentState=Starting gs=supertuxkart-4cchc test=TestSuperTuxKartGameServerReady
VERBOSE: time="2025-10-31 20:43:10.257" level=info msg="Waiting for states to match" awaitingState=Ready currentState=Scheduled gs=supertuxkart-4cchc test=TestSuperTuxKartGameServerReady
VERBOSE: time="2025-10-31 20:43:11.258" level=info msg="Waiting for states to match" awaitingState=Ready currentState=Scheduled gs=supertuxkart-4cchc test=TestSuperTuxKartGameServerReady
VERBOSE: time="2025-10-31 20:43:12.261" level=info msg="Waiting for states to match" awaitingState=Ready currentState=Scheduled gs=supertuxkart-4cchc test=TestSuperTuxKartGameServerReady
VERBOSE: time="2025-10-31 20:43:13.268" level=info msg="Waiting for states to match" awaitingState=Ready currentState=Scheduled gs=supertuxkart-4cchc test=TestSuperTuxKartGameServerReady
VERBOSE: time="2025-10-31 20:43:14.266" level=info msg="Waiting for states to match" awaitingState=Ready currentState=Scheduled gs=supertuxkart-4cchc test=TestSuperTuxKartGameServerReady
VERBOSE: time="2025-10-31 20:43:15.264" level=info msg="Waiting for states to match" awaitingState=Ready currentState=Scheduled gs=supertuxkart-4cchc test=TestSuperTuxKartGameServerReady
VERBOSE: time="2025-10-31 20:43:16.259" level=info msg="Waiting for states to match" awaitingState=Ready currentState=Scheduled gs=supertuxkart-4cchc test=TestSuperTuxKartGameServerReady
VERBOSE: time="2025-10-31 20:43:17.259" level=error msg="GameServer reached terminal state" awaitingState=Ready currentState=Unhealthy gs=supertuxkart-4cchc test=TestSuperTuxKartGameServerReady
VERBOSE: time="2025-10-31 20:43:17.261" level=info msg="Game Server Events:" test=TestSuperTuxKartGameServerReady
VERBOSE: time="2025-10-31 20:43:17.262" level=info msg="Dumping Events:" kind= test=TestSuperTuxKartGameServerReady
VERBOSE: time="2025-10-31 20:43:17.368" level=info msg="Event!" lastTimestamp="2025-10-31 20:43:08 +0000 UTC" message="Port allocated" reason=PortAllocation test=TestSuperTuxKartGameServerReady type=Normal
VERBOSE: time="2025-10-31 20:43:17.369" level=info msg="Event!" lastTimestamp="2025-10-31 20:43:08 +0000 UTC" message="Pod supertuxkart-4cchc created" reason=Creating test=TestSuperTuxKartGameServerReady type=Normal
VERBOSE: time="2025-10-31 20:43:17.369" level=info msg="Event!" lastTimestamp="2025-10-31 20:43:09 +0000 UTC" message="Address and port populated" reason=Scheduled test=TestSuperTuxKartGameServerReady type=Normal
VERBOSE: time="2025-10-31 20:43:17.369" level=info msg="Event!" lastTimestamp="2025-10-31 20:43:16 +0000 UTC" message="Issue with Gameserver pod" reason=Unhealthy test=TestSuperTuxKartGameServerReady type=Warning
VERBOSE: time="2025-10-31 20:43:17.369" level=info msg="Game Server Pod Events:" test=TestSuperTuxKartGameServerReady
VERBOSE: time="2025-10-31 20:43:17.471" level=info msg="Dumping Events:" kind= test=TestSuperTuxKartGameServerReady
VERBOSE: time="2025-10-31 20:43:17.570" level=info msg="Event!" lastTimestamp="0001-01-01 00:00:00 +0000 UTC" message="Successfully assigned 1761943387/supertuxkart-4cchc to gke-standard-e2e-test-cluster-default-7eff966d-43v7" reason=Scheduled test=TestSuperTuxKartGameServerReady type=Normal
VERBOSE: time="2025-10-31 20:43:17.570" level=info msg="Event!" lastTimestamp="2025-10-31 20:43:08 +0000 UTC" message="Pulling image \"us-docker.pkg.dev/agones-images/ci/agones-sdk:1.54.0-dev-76f28ef\"" reason=Pulling test=TestSuperTuxKartGameServerReady type=Normal
VERBOSE: time="2025-10-31 20:43:17.571" level=info msg="Event!" lastTimestamp="2025-10-31 20:43:09 +0000 UTC" message="Successfully pulled image \"us-docker.pkg.dev/agones-images/ci/agones-sdk:1.54.0-dev-76f28ef\" in 262ms (262ms including waiting). Image size: 33754948 bytes." reason=Pulled test=TestSuperTuxKartGameServerReady type=Normal
VERBOSE: time="2025-10-31 20:43:17.571" level=info msg="Event!" lastTimestamp="2025-10-31 20:43:09 +0000 UTC" message="Created container: agones-gameserver-sidecar" reason=Created test=TestSuperTuxKartGameServerReady type=Normal
VERBOSE: time="2025-10-31 20:43:17.571" level=info msg="Event!" lastTimestamp="2025-10-31 20:43:09 +0000 UTC" message="Started container agones-gameserver-sidecar" reason=Started test=TestSuperTuxKartGameServerReady type=Normal
VERBOSE: time="2025-10-31 20:43:17.571" level=info msg="Event!" lastTimestamp="2025-10-31 20:43:09 +0000 UTC" message="Pulling image \"us-docker.pkg.dev/agones-images/examples/supertuxkart-example:0.19\"" reason=Pulling test=TestSuperTuxKartGameServerReady type=Normal
VERBOSE: time="2025-10-31 20:43:17.572" level=info msg="Event!" lastTimestamp="2025-10-31 20:43:10 +0000 UTC" message="Successfully pulled image \"us-docker.pkg.dev/agones-images/examples/supertuxkart-example:0.19\" in 1.125s (1.125s including waiting). Image size: 827257650 bytes." reason=Pulled test=TestSuperTuxKartGameServerReady type=Normal
VERBOSE: time="2025-10-31 20:43:17.572" level=info msg="Event!" lastTimestamp="2025-10-31 20:43:10 +0000 UTC" message="Created container: supertuxkart" reason=Created test=TestSuperTuxKartGameServerReady type=Normal
VERBOSE: time="2025-10-31 20:43:17.572" level=info msg="Event!" lastTimestamp="2025-10-31 20:43:10 +0000 UTC" message="Started container supertuxkart" reason=Started test=TestSuperTuxKartGameServerReady type=Normal
VERBOSE: time="2025-10-31 20:43:17.572" level=info msg="Event!" lastTimestamp="2025-10-31 20:43:13 +0000 UTC" message="Stopping container agones-gameserver-sidecar" reason=Killing test=TestSuperTuxKartGameServerReady type=Normal
VERBOSE: time="2025-10-31 20:43:17.572" level=info msg="Game Server Container Logs:" test=TestSuperTuxKartGameServerReady
VERBOSE: time="2025-10-31 20:43:17.572" level=info msg="Logs for Pod:" namespace=1761943387 pod=supertuxkart-4cchc test=TestSuperTuxKartGameServerReady
VERBOSE: time="2025-10-31 20:43:17.695" level=info msg="---Logs for container---" options="&PodLogOptions{Container:supertuxkart,Follow:false,Previous:false,SinceSeconds:nil,SinceTime:<nil>,Timestamps:false,TailLines:nil,LimitBytes:nil,InsecureSkipTLSVerifyBackend:false,Stream:nil,}" test=TestSuperTuxKartGameServerReady
VERBOSE: time="2025-10-31 20:43:17.696" level=info msg="[wrapper] 2025/10/31 20:43:10 Connecting to Agones with the SDK" options="&PodLogOptions{Container:supertuxkart,Follow:false,Previous:false,SinceSeconds:nil,SinceTime:<nil>,Timestamps:false,TailLines:nil,LimitBytes:nil,InsecureSkipTLSVerifyBackend:false,Stream:nil,}" test=TestSuperTuxKartGameServerReady
VERBOSE: time="2025-10-31 20:43:17.696" level=info msg="[wrapper] 2025/10/31 20:43:10 Starting health checking" options="&PodLogOptions{Container:supertuxkart,Follow:false,Previous:false,SinceSeconds:nil,SinceTime:<nil>,Timestamps:false,TailLines:nil,LimitBytes:nil,InsecureSkipTLSVerifyBackend:false,Stream:nil,}" test=TestSuperTuxKartGameServerReady
VERBOSE: time="2025-10-31 20:43:17.696" level=info msg="[wrapper] 2025/10/31 20:43:10 Starting wrapper for SuperTuxKart" options="&PodLogOptions{Container:supertuxkart,Follow:false,Previous:false,SinceSeconds:nil,SinceTime:<nil>,Timestamps:false,TailLines:nil,LimitBytes:nil,InsecureSkipTLSVerifyBackend:false,Stream:nil,}" test=TestSuperTuxKartGameServerReady
VERBOSE: time="2025-10-31 20:43:17.696" level=info msg="[wrapper] 2025/10/31 20:43:10 Command being run for SuperTuxKart server: /usr/games/supertuxkart --server-config=/home/supertuxkart/server_config.xml " options="&PodLogOptions{Container:supertuxkart,Follow:false,Previous:false,SinceSeconds:nil,SinceTime:<nil>,Timestamps:false,TailLines:nil,LimitBytes:nil,InsecureSkipTLSVerifyBackend:false,Stream:nil,}" test=TestSuperTuxKartGameServerReady
VERBOSE: time="2025-10-31 20:43:17.697" level=info msg="2025/10/31 20:43:11 Waiting for /home/supertuxkart/.config/supertuxkart/config-0.10/server_config.log to appear..." options="&PodLogOptions{Container:supertuxkart,Follow:false,Previous:false,SinceSeconds:nil,SinceTime:<nil>,Timestamps:false,TailLines:nil,LimitBytes:nil,InsecureSkipTLSVerifyBackend:false,Stream:nil,}" test=TestSuperTuxKartGameServerReady
VERBOSE: time="2025-10-31 20:43:17.697" level=info msg="[wrapper] 2025/10/31 20:43:11 tail ended" options="&PodLogOptions{Container:supertuxkart,Follow:false,Previous:false,SinceSeconds:nil,SinceTime:<nil>,Timestamps:false,TailLines:nil,LimitBytes:nil,InsecureSkipTLSVerifyBackend:false,Stream:nil,}" test=TestSuperTuxKartGameServerReady
VERBOSE: time="2025-10-31 20:43:17.697" level=info msg="---End of container logs---" options="&PodLogOptions{Container:supertuxkart,Follow:false,Previous:false,SinceSeconds:nil,SinceTime:<nil>,Timestamps:false,TailLines:nil,LimitBytes:nil,InsecureSkipTLSVerifyBackend:false,Stream:nil,}" test=TestSuperTuxKartGameServerReady
VERBOSE: time="2025-10-31 20:43:17.788" level=warning msg="Error opening log stream for container" error="previous terminated container \"supertuxkart\" in pod \"supertuxkart-4cchc\" not found" options="&PodLogOptions{Container:supertuxkart,Follow:false,Previous:true,SinceSeconds:nil,SinceTime:<nil>,Timestamps:false,TailLines:nil,LimitBytes:nil,InsecureSkipTLSVerifyBackend:false,Stream:nil,}" test=TestSuperTuxKartGameServerReady
VERBOSE:     examples_test.go:93: 
VERBOSE:         	Error Trace:	/go/src/agones.dev/agones/test/e2e/examples_test.go:93
VERBOSE:         	Error:      	Received unexpected error:
VERBOSE:         	            	waiting for {supertuxkart [{default  Dynamic <nil> 8080 0 UDP}] {false 10 0 120}  { 0 0} {{      0 0001-01-01 00:00:00 +0000 UTC <nil> <nil> map[] map[] [] [] []} {[] [] [{supertuxkart us-docker.pkg.dev/agones-images/examples/supertuxkart-example:0.19 [] []  [] [] [{ENABLE_PLAYER_TRACKING false nil}] {map[] map[cpu:{{500 -3} {<nil>} 500m DecimalSI} memory:{{1073741824 0} {<nil>} 1Gi BinarySI}] []} [] <nil> [] [] nil nil nil nil    nil false false false}] []  <nil> <nil>  map[]   <nil>  false false false <nil> nil []   nil  [] []  <nil> nil [] <nil> <nil> <nil> map[] [] <nil> nil <nil> [] [] nil}} <nil> map[] map[] <nil>} GameServer instance readiness timed out (): waiting for GameServer 1761943387/supertuxkart-4cchc to be Ready: GameServer reached terminal state Unhealthy
VERBOSE:         	Test:       	TestSuperTuxKartGameServerReady
VERBOSE:         	Messages:   	Ready game server timed out
VERBOSE: --- FAIL: TestSuperTuxKartGameServerReady (9.80s)
VERBOSE: FAIL
VERBOSE: time="2025-10-31 20:43:18.006" level=info msg="Namespace 1761943387 is deleted"
VERBOSE: FAIL	agones.dev/agones/test/e2e	11.421s

Looking at the above, it seems like the log file may not get created?

What you expected to happen:

Should pass consistently.

How to reproduce it (as minimally and precisely as possible):

Happens randomly in e2e tests. Hard to reproduce.

Anything else we need to know?:

Environment: dev

  • Agones version: dev
  • Kubernetes version (use kubectl version): all?
  • Cloud provider or hardware configuration: GKE standard and Autopilot.
  • Install method (yaml/helm): helm
  • Troubleshooting guide log(s): see above.
  • Others:

markmandel avatar Oct 31 '25 21:10 markmandel

@markmandel can i take up this issue i am interested

harshasiddartha avatar Nov 01 '25 08:11 harshasiddartha

@markmandel can i take up this issue i am interested

Go for it! 👍🏻

markmandel avatar Nov 01 '25 15:11 markmandel

There may also be a newer version of STK - might be worth upgrading at the same time? 🤔

markmandel avatar Nov 01 '25 15:11 markmandel

This could be a relatively simple fix of increasing the wait time. Since the sidecar feature is on that could explain the increase in start time needed.

https://github.com/googleforgames/agones/blob/47fa016a616ff8c60e67b739de63e5513efb07d0/examples/supertuxkart/main.go#L83-L85

igooch avatar Nov 03 '25 19:11 igooch

Another one! https://console.cloud.google.com/cloud-build/builds/a8d3ac8c-7b74-4210-a167-961c4a32ca9d;step=1?project=agones-images

markmandel avatar Nov 18 '25 18:11 markmandel

This could be a relatively simple fix of increasing the wait time. Since the sidecar feature is on that could explain the increase in start time needed.

agones/examples/supertuxkart/main.go

Lines 83 to 85 in 47fa016

// Loop to make sure the log has been created. Sometimes it takes a few seconds for i := 0; i < 10; i++ { time.Sleep(time.Second)

@Sivasankaran25 or @indurireddy-TF could you create a PR increasing the timeout? This may or may not fix the issue, but it's worth trying before making more substantial changes.

igooch avatar Nov 18 '25 19:11 igooch

I've already got it - PR incoming shortly 👍🏻

markmandel avatar Nov 18 '25 21:11 markmandel

Haven't seen this in a while... dare I close this ticket?

markmandel avatar Dec 09 '25 22:12 markmandel

It's been quite some time, which is great ! I'm fine to close it, let's hope it doesn't come back 🤞🏼 !

lacroixthomas avatar Dec 17 '25 23:12 lacroixthomas

CLOSING! 😱

markmandel avatar Dec 17 '25 23:12 markmandel