mercurius-gateway

Load balancer must check if the server in the provided pool is online before sending a request

Open SiNONiMiTY opened this issue 2 years ago • 4 comments

Title says

I have a scenario where I provide two URLs for a single subgraph as an array:

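const Fastify = require('fastify')
const mercuriusGateway = require('@mercuriusjs/gateway')

const gateway = Fastify()
gateway.register(mercuriusGateway, {
    gateway: {
        services: [
            {
                "name": "user",
                // Two URLs for the same subgraph; the gateway balances requests across them
                "url": [
                    "http://endpoint1:4001/graphql",
                    "http://endpoint2:4001/graphql"
                ],
                "schema": "type Query { id: ID }"
            }
        ]
    }
})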

endpoint2 is intentionally taken down and only endpoint1 is working. However, when sending queries to the gateway, I occasionally receive ECONNREFUSED errors for endpoint2.

The load balancing mechanism should first do a test ping if the host is reachable before sending a request.

SiNONiMiTY • Feb 21 '23 04:02

Unfortunately it's a bit more complex than sending a "ping", as those errors come from existing sockets that are truncated.

How are you shutting down your upstream servers? Are they closing gracefully or are they crashing?

mcollina • Feb 21 '23 08:02

I am starting the gateway with only one of the two provided subgraph URLs online.
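
For this startup-only case, one possible workaround is to probe each URL before handing the list to the gateway. The following is a minimal sketch, not part of mercurius-gateway or undici: `canConnect` and `filterReachable` are hypothetical helper names, the 1000 ms timeout is an arbitrary assumption, and a plain TCP connect is used as the "ping". It would not help with upstreams that go down after the gateway has started.

```javascript
const net = require('node:net')

// Hypothetical helper: resolves true if a TCP connection to the URL's
// host and port succeeds within timeoutMs, false otherwise.
function canConnect (url, timeoutMs = 1000) {
  const { hostname, port, protocol } = new URL(url)
  // Fall back to the scheme's default port when the URL has none.
  const targetPort = Number(port) || (protocol === 'https:' ? 443 : 80)
  return new Promise((resolve) => {
    const socket = net.connect({ host: hostname, port: targetPort })
    const finish = (ok) => {
      socket.destroy()
      resolve(ok)
    }
    socket.setTimeout(timeoutMs)
    socket.once('connect', () => finish(true))
    socket.once('timeout', () => finish(false))
    socket.once('error', () => finish(false)) // e.g. ECONNREFUSED
  })
}

// Hypothetical helper: keeps only the URLs whose probe resolves true.
// The probe is injectable so it can be swapped for an HTTP health check.
async function filterReachable (urls, probe = canConnect) {
  const results = await Promise.all(urls.map((url) => probe(url)))
  return urls.filter((_, i) => results[i])
}
```

The array returned by `filterReachable` could then be used as the service's `url` value when registering the gateway.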

SiNONiMiTY • Feb 21 '23 08:02

Thanks, that helps!

I think there is a bug in undici's BalancedPool: it routes requests to an upstream even if it could not connect to it, and it does not retry or send the request elsewhere when the connection fails. Things stabilize over time because of the BalancedPool algorithm, so only a small number of requests would fail.

The bad news is that I don't have time right now to fix it there.

mcollina • Feb 21 '23 08:02

Yes! I noticed that the balancing algorithm eventually only selects the online server after sending some requests.
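
The behavior described here can be illustrated with a toy simulation. This is a simplified model and not undici's actual code: it assumes a smooth weighted round-robin in which a failed connection lowers an upstream's weight and a success raises it. The function name `simulate` and the values `maxWeight = 100` and `penalty = 15` are illustrative assumptions.

```javascript
// Simplified model, NOT undici's implementation: smooth weighted round-robin
// where a failed connection lowers an upstream's weight and a success raises it.
function simulate ({ requests = 200, maxWeight = 100, penalty = 15 } = {}) {
  const upstreams = [
    { name: 'endpoint1', online: true, weight: maxWeight, current: 0 },
    { name: 'endpoint2', online: false, weight: maxWeight, current: 0 }
  ]
  const failedAt = [] // 1-based request numbers that hit the offline upstream

  for (let n = 1; n <= requests; n++) {
    const total = upstreams.reduce((sum, u) => sum + u.weight, 0)

    // Each upstream accumulates its weight; the largest accumulator wins.
    let picked = upstreams[0]
    for (const u of upstreams) {
      u.current += u.weight
      if (u.current > picked.current) picked = u
    }
    picked.current -= total

    if (picked.online) {
      picked.weight = Math.min(maxWeight, picked.weight + penalty)
    } else {
      // Connection refused: penalize the upstream, but never drop it entirely.
      picked.weight = Math.max(1, picked.weight - penalty)
      failedAt.push(n)
    }
  }
  return failedAt
}

const failedAt = simulate({ requests: 200 })
console.log('requests that failed:', failedAt.join(', '))
```

In this model the failures cluster in the first few dozen requests and then become rare, because the offline upstream keeps a small non-zero weight and is still tried occasionally.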

SiNONiMiTY • Feb 21 '23 08:02