Gabor Retvari comments

Results: 105 comments by Gabor Retvari

TURN connection breaks when the backend pod enters graceful shutdown

This turns out to be two related bugs, both caused by the fact that we immediately remove the pod IP from the `stunnerd` config when the pod enters the terminating state, instead...

TURN connection breaks when the backend pod enters graceful shutdown

Further investigation: it turns out the problem is that we're still using the old Endpoints API for backend pod discovery, which does *not* consider terminating pods, in contrast to the modern...
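To make the difference concrete, here is a minimal Go sketch of the selection rule an EndpointSlice-based controller could apply. It is not STUNner's actual code: the `Endpoint` struct and the `BackendAddrs` helper are hypothetical, simplified stand-ins for the `ready`, `serving` and `terminating` conditions that the `discovery.k8s.io/v1` API reports per endpoint (as optional booleans in the real API). The addresses and the assumption that the terminating pod is still serving are illustrative.

```go
package main

import "fmt"

// Endpoint is a hypothetical, simplified stand-in for one entry of a
// Kubernetes EndpointSlice; only the three condition flags are modeled.
type Endpoint struct {
	Address     string
	Ready       bool
	Serving     bool
	Terminating bool
}

// BackendAddrs returns the addresses that should stay reachable: ready
// endpoints plus endpoints that are terminating but still serving.
func BackendAddrs(eps []Endpoint) []string {
	addrs := []string{}
	for _, ep := range eps {
		if ep.Ready || (ep.Terminating && ep.Serving) {
			addrs = append(addrs, ep.Address)
		}
	}
	return addrs
}

func main() {
	// Two ready pods and one pod in graceful shutdown (illustrative values).
	eps := []Endpoint{
		{Address: "10.244.0.14", Ready: true, Serving: true},
		{Address: "10.244.0.12", Serving: true, Terminating: true},
		{Address: "10.244.0.13", Ready: true, Serving: true},
	}
	fmt.Println(BackendAddrs(eps))
}
```

With this rule the terminating pod's address stays in the permitted backend set until the pod stops serving, so existing connections to it are not cut off the moment shutdown starts.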

TURN connection breaks when the backend pod enters graceful shutdown

Addendum: with two "ready" pods and one "terminating" pod:
```
media-plane-55658cb4f5-hdw6c   1/1   Running       0   10.244.0.14
media-plane-55658cb4f5-pjp9c   1/1   Terminating   0   10.244.0.12
media-plane-55658cb4f5-vjvnz   1/1   Running       0   10.244.0.13
```
We get this EndpointSlice: ```...

TURN connection breaks when the backend pod enters graceful shutdown

I've tested this and the issue can no longer be reproduced with the new STUNner `dev` version that uses the EndpointSlice controller.
1. Fire up the UDP greeter example again...

TURN connection breaks when the backend pod enters graceful shutdown

I'm not a hundred percent sure I understand the problem correctly. Do you terminate a **backend pod** and this makes your connections break, or do you terminate a **stunnerd pod** and...

TURN connection breaks when the backend pod enters graceful shutdown

I'm afraid you're using graceful shutdown incorrectly. The idea is that as long as there are active TURN allocations STUNner will refuse to shut down even if the `stunnerd` pod...

TURN connection breaks when the backend pod enters graceful shutdown

Now we're talking! :-) Interesting. Is this AWS/EKS? Anyway, by design STUNner *does* accept new connections during graceful shutdown, precisely to cope with buggy or slow load balancers. This makes the `preStop` hook unnecessary....

TURN connection breaks when the backend pod enters graceful shutdown

We're so grateful you tested this! Graceful shutdown is one of the most critical features that cloud-native software must provide, but getting it right, especially for gateways, is quite difficult....

SRS integration?

Also, the STUNner [scaling guide](https://github.com/l7mp/stunner/blob/main/docs/SCALING.md) might be of interest here.

SRS integration?

This issue has been stale for a couple of months, closing it for now. Feel free to reopen if anything new comes up.