
WebSocket parsing issue for `connection_error` (`graphql_transport_ws` message)

Open bbil opened this issue 4 months ago • 0 comments

Describe the bug

Apollo Router handles both WebSocket subprotocols with a single Rust enum, ServerMessage, which parses WebSocket messages from the subgraph regardless of which subprotocol is in use.

#[derive(Deserialize, Serialize, Debug)]
#[serde(tag = "type", rename_all = "snake_case")]
pub(crate) enum ServerMessage {
    // Other variants omitted for brevity.
    #[serde(alias = "connection_error")]
    Error {
        id: String,
        payload: ServerError,
    },
}

The ServerMessage::Error variant is used to deserialize both type: error and type: connection_error messages. As written, both id and payload are required fields.

However, the graphql_transport_ws specification describes the connection_error message as having only a payload field. Link to protocol.

To Reproduce

Steps to reproduce the behavior:

  1. Set up the subgraph to respond to the connection_init message with connection_error
  2. Attempt to create a subscription through the router
  3. Observe the error in the router's response

Expected behavior

A connection_error message containing only a payload should parse correctly, and the router should return an appropriate error message to the client initiating the subscription.

Output

Response from router

{
    "data": null,
    "errors": [
        {
            "message": "HTTP fetch failed from 'subgraph': Websocket fetch failed from 'subgraph': cannot get the GraphQL websocket stream: didn't receive the connection ack from websocket connection but instead got: Some(Err(Error(\"missing field `id`\", line: 0, column: 0)))",
            "path": [],
            "extensions": {
                "code": "SUBREQUEST_HTTP_ERROR",
                "service": "subgraph",
                "reason": "Websocket fetch failed from 'subgraph': cannot get the GraphQL websocket stream: didn't receive the connection ack from websocket connection but instead got: Some(Err(Error(\"missing field `id`\", line: 0, column: 0)))"
            }
        }
    ]
}

After ad-hoc patch

I modified Apollo Router to add #[serde(default)] to the id field. The response below is what I got afterwards: it now includes the error from the subgraph, which gives a much better idea of what the underlying issue is.

{
    "data": null,
    "errors": [
        {
            "message": "HTTP fetch failed from 'subgraph': Websocket fetch failed from 'subgraph': cannot get the GraphQL websocket stream: didn't receive the connection ack from websocket connection but instead got: Some(Ok(Error { id: \"\", payload: Error(Error { message: \"Unprocessable entity error, bad request.\", locations: [], path: None, extensions: {\"code\": String(\"UNPROCESSABLE_ENTITY\")} }) }))",
            "path": [],
            "extensions": {
                "code": "SUBREQUEST_HTTP_ERROR",
                "service": "subgraph",
                "reason": "Websocket fetch failed from 'subgraph': cannot get the GraphQL websocket stream: didn't receive the connection ack from websocket connection but instead got: Some(Ok(Error { id: \"\", payload: Error(Error { message: \"Unprocessable entity error, bad request.\", locations: [], path: None, extensions: {\"code\": String(\"UNPROCESSABLE_ENTITY\")} }) }))"
            }
        }
    ]
}

Desktop (please complete the following information):

I tested with only the Explorer interface, not Apollo Client.

Additional context

connection_error from Subgraph

{
    "payload": {
        "message": "Unprocessable entity error, bad request.",
        "extensions": {
            "code": "UNPROCESSABLE_ENTITY"
        }
    },
    "type": "connection_error"
}

bbil, Oct 10 '24 19:10