Gateway status at GUI sometimes doesn't sync properly
Hello,
From time to time I see an edge case where the gateway status in the GUI is shown as disconnected even though the gateway itself is connected to the core service. It happens randomly when, for some reason, the gateway loses its connection to the core service for a while. For example, here are logs from the gateway:
Apr 15 14:20:01 bnk.labas.io defguard-gateway[1328483]: [2025-04-15T14:20:01Z INFO defguard_gateway::gateway] Connected to Defguard gRPC endpoint: https://defguard.labas.io:444/
Apr 15 23:40:50 bnk.labas.io defguard-gateway[1328483]: [2025-04-15T23:40:50Z ERROR defguard_gateway::gateway] Disconnected from Defguard gRPC endoint: https://defguard.labas.io:444/: status: Unknown, message: "h2 protocol error: error reading a body from connec>
Apr 15 23:40:50 bnk.labas.io defguard-gateway[1328483]: [2025-04-15T23:40:50Z ERROR defguard_gateway::gateway] Updates stream aborted; reconnecting
Apr 15 23:41:20 bnk.labas.io defguard-gateway[1328483]: [2025-04-15T23:41:20Z ERROR defguard_gateway::gateway] Couldn't retrieve gateway configuration from the core. Using gRPC URL: https://defguard.labas.io:444/. Retrying in 10s. Error: status: Unavailable, mes>
Apr 15 23:41:40 bnk.labas.io defguard-gateway[1328483]: [2025-04-15T23:41:40Z INFO defguard_gateway::gateway] Connected to Defguard gRPC endpoint: https://defguard.labas.io:444/
Meanwhile, the GUI still shows it as disconnected.
Everything works from the gateway's perspective, and users can connect to it. However, I believe the status in the GUI should also be synced automatically once the gateway reconnects to the core service?
If I restart the defguard-gateway service manually, the GUI status also changes to connected.
Core service version: 1.2.3
Proxy service version: 1.2.0
Gateway version: 1.2.1
Hi @NerijusRazvodovskis, Could you please try capturing logs from the core at the debug level as well? It would be helpful to see how the core is handling this case, especially since there are logs upon connect/disconnect events.
Thanks!
@filipslezaklab right, I will try to do it, but it could take a while to replicate.
@NerijusRazvodovskis I suspect that the gateway may actually not be connected to the core (even if only for a while; those connection losses may actually occur in the network?), but the gateway is designed to reconnect and 'hold' the status of peers, so the VPN will keep working.
@teon but at the end of the log it says that the gateway reconnected to the core service (right after the errors):
Apr 15 23:41:40 bnk.labas.io defguard-gateway[1328483]: [2025-04-15T23:41:40Z INFO defguard_gateway::gateway] Connected to Defguard gRPC endpoint: https://defguard.labas.io:444/
However, I will try to simulate this behaviour sometime next week. I have already enabled debug mode on the core service in case it happens over the weekend.
Well, sadly I couldn't replicate it by hand. I tried blocking the connection to the core/proxy services (a few times) and resuming it later. It was handled correctly; it seems this edge case happens only sometimes, leaving the GUI status unsynced. I will test further and update this issue when possible.
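Since the problem only shows up occasionally, it may help to know exactly when the gateway lost and regained its connection, so those timestamps can be compared with the core debug logs and the GUI status. Below is a minimal sketch in Python (not part of Defguard) that scans gateway log lines and lists each outage window. The function names are hypothetical, and the message prefixes are assumptions copied from the log excerpt in this issue; other gateway versions may word them differently.

```python
import re
from datetime import datetime

# Matches the bracketed part of a gateway log line, e.g.
# "[2025-04-15T23:41:40Z INFO defguard_gateway::gateway] Connected to ..."
LINE_RE = re.compile(
    r"\[(\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2})Z (\w+) [\w:]+\] (.*)"
)


def parse_events(lines):
    """Return (timestamp, state) pairs for connect/disconnect messages."""
    events = []
    for line in lines:
        m = LINE_RE.search(line)
        if not m:
            continue
        ts = datetime.strptime(m.group(1), "%Y-%m-%dT%H:%M:%S")
        msg = m.group(3)
        # Prefixes assumed from the excerpt in this issue.
        if msg.startswith("Connected to Defguard gRPC"):
            events.append((ts, "connected"))
        elif msg.startswith("Disconnected from Defguard gRPC"):
            events.append((ts, "disconnected"))
    return events


def reconnect_windows(events):
    """Pair each disconnect with the next connect: (start, end, seconds)."""
    windows = []
    down_since = None
    for ts, state in events:
        if state == "disconnected" and down_since is None:
            down_since = ts
        elif state == "connected" and down_since is not None:
            windows.append((down_since, ts, (ts - down_since).total_seconds()))
            down_since = None
    return windows


# Lines shortened from the excerpt in this issue (messages cut after the prefix).
SAMPLE = [
    "[2025-04-15T14:20:01Z INFO defguard_gateway::gateway] Connected to Defguard gRPC endpoint",
    "[2025-04-15T23:40:50Z ERROR defguard_gateway::gateway] Disconnected from Defguard gRPC endoint",
    "[2025-04-15T23:40:50Z ERROR defguard_gateway::gateway] Updates stream aborted; reconnecting",
    "[2025-04-15T23:41:40Z INFO defguard_gateway::gateway] Connected to Defguard gRPC endpoint",
]

for start, end, seconds in reconnect_windows(parse_events(SAMPLE)):
    print(f"gateway down from {start} to {end} ({seconds:.0f}s)")
```

If the GUI still shows the gateway as disconnected after the end of a window, the core debug logs around that end timestamp are the place to look for a missing connect event.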
We saw similar behavior, which on our side seemed to be triggered when the ingress of the core was restarted (ingress nginx with gRPC; we know it's not officially supported here). The core logs didn't indicate that the gateway had reconnected, but the gateway was still working, even with 2FA enabled. I haven't enabled debug logs yet, but I'm sharing this in case it helps to replicate the issue. I will try to replicate it on our side in the upcoming weeks (and maybe replace the ingress with a sidecar with SSL offloading).
Closing the issue since we can't reproduce it. If you have any more data or debug logs, please reopen.