apollo-ios
Msgs received on subscription with pending cancel cause unprocessedMessage error on remaining (uncancelled) subscription
Bug report
In a situation where...
- There are multiple subscriptions,
- One of the subscriptions is canceled, while the rest are allowed to continue, and
- a message is received on the subscription being canceled, before the server completes the cancel
...then a subscription that is not being cancelled receives an unprocessedMessage error for the message on the subscription that is being cancelled.
Versions
Please fill in the versions you're currently using:
- apollo-ios SDK version: 0.48.0 (also did a quick check on 0.51)
- Xcode version: 13.2.1
- Swift version: 5
- Package manager: SPM
Steps to reproduce
- Set up two subscriptions, SubscriptionA and SubscriptionB, using
public func subscribe<Subscription: GraphQLSubscription>(subscription: Subscription,
queue: DispatchQueue = .main,
resultHandler: @escaping GraphQLResultHandler<Subscription.Data>) -> Cancellable
- Cancel SubscriptionB using
public protocol Cancellable: AnyObject {
/// Cancel an in progress action.
func cancel()
}
- From the server, deliver a message for SubscriptionB ahead of the "complete" message for its cancellation (simulating a message in-flight at the time of the cancel)
- This should elicit an unprocessedMessage error on SubscriptionA
(Note: I tried to reproduce this in a simple example, but I can't get the sample app at apollographql/iOSTutorial.git to connect to the server from apollographql/fullstack-tutorial.git. It reports "Invalid HTTP upgrade" every time, even after trying many combinations of device vs. simulator, ws vs. wss, and so on. But that's a separate matter.)
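The sequence in the steps above can be modeled with a small, self-contained sketch. This is a toy model of the behaviour described in this report, not Apollo's implementation: `ToyTransport`, `RoutedEvent`, and `runScenario` are hypothetical names, and the rule "a message with no matching subscriber is reported to every remaining handler" is an assumption inferred from the symptoms.

```swift
import Foundation

// Hypothetical event type: what a subscription handler can observe.
enum RoutedEvent: Equatable {
    case data(String)                 // payload delivered to its own subscription
    case unprocessedMessage(String)   // message that matched no subscriber
}

// Toy routing table keyed by operation id. Not Apollo's code.
final class ToyTransport {
    private var subscribers: [String: (RoutedEvent) -> Void] = [:]
    private var nextID = 1

    /// Registers a handler and returns the operation id assigned to it.
    func subscribe(_ handler: @escaping (RoutedEvent) -> Void) -> String {
        let id = String(nextID)
        nextID += 1
        subscribers[id] = handler
        return id
    }

    /// Client-side cancel: the handler is dropped immediately, before the
    /// server has confirmed with its own "complete" message.
    func cancel(id: String) {
        subscribers[id] = nil
    }

    /// Routes an incoming message. Assumed behaviour: with no subscriber
    /// for `id`, the message is reported to every remaining handler.
    func receive(id: String, payload: String) {
        if let handler = subscribers[id] {
            handler(.data(payload))
        } else {
            for handler in subscribers.values {
                handler(.unprocessedMessage(payload))
            }
        }
    }
}

/// Runs the scenario from the steps above and returns what A and B observed.
func runScenario() -> (a: [RoutedEvent], b: [RoutedEvent]) {
    var eventsA: [RoutedEvent] = []
    var eventsB: [RoutedEvent] = []
    let transport = ToyTransport()
    let idA = transport.subscribe { eventsA.append($0) }
    let idB = transport.subscribe { eventsB.append($0) }

    transport.receive(id: idA, payload: "thing1 update")
    transport.cancel(id: idB)
    // In-flight message for B arrives after the local cancel.
    transport.receive(id: idB, payload: "thing2 update")
    return (eventsA, eventsB)
}

print(runScenario().a)
```

In this model, SubscriptionA ends up with its own update followed by an unprocessed-message event carrying SubscriptionB's payload, while SubscriptionB sees nothing after its cancel.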
Further details
This is happening with some regularity with our application and server because the usage pattern we have tends to produce a message on the subscription almost simultaneously with the act of cancelling it. It ends up being difficult to reliably distinguish this error from a "real" error on SubscriptionA, without inspecting the implementation details in the error message to establish that it's "just an unprocessed message meant for our cancelled SubscriptionB".
I've created a sample project at [link to apolloquestion0001 removed, outdated] that is able to replicate this issue fairly reliably.
It includes
- a simple GraphQL websocket server that supports two different subscriptions, and
- an iOS client that supports subscribing to both, and cancelling one of them.
To run the server, go to the apolloquestion0001/server directory and do the following:
nvm use v16.14.0
npm install @apollo/server express graphql cors body-parser apollo-server ws graphql-ws graphql-subscriptions
node app.mjs
To run the iOS client, open the client/apolloquestion0001/apolloquestion0001.xcodeproj, reset the package cache at File -> Packages -> Reset Package Caches, then run the app in the simulator. (The iOS app connects to localhost:4000/graphql so it expects the server to be running on the same machine as the simulator).
In the simulator, there will be a button to "Connect and Run Operations", which will start the two subscriptions. While receiving updates, click the "Cancel Thing2 subscription" button. A decent percentage of the time, this will result in this error, which will appear in the logs:
AppController: handleThing1Result network error WebSocketError(payload: Optional(["data": AnyHashable([AnyHashable("thing2Subscription"): AnyHashable([AnyHashable("__typename"): AnyHashable("Thing2"), AnyHashable("id"): AnyHashable("Thing2_1573714"), AnyHashable("description"): AnyHashable("Created via thing2Subscription")])])]), error: nil, kind: ApolloWebSocket.WebSocketError.ErrorKind.unprocessedMessage("{\"id\":\"2\",\"type\":\"next\",\"payload\":{\"data\":{\"thing2Subscription\":{\"__typename\":\"Thing2\",\"id\":\"Thing2_1573714\",\"description\":\"Created via thing2Subscription\"}}}}"))
If the error doesn't occur, click "Cancel operations and disconnect", then try again through "Connect and Run Operations".
The issue is that this error arrives on the subscription that was not cancelled, and we don't know what to do with it, or how to distinguish it from other errors.
More specifically it is an unprocessed "Thing 2" message error, arriving in the error handler for the "Thing 1" subscription, on the path described as "Network Errors" in the Apollo docs at https://www.apollographql.com/docs/ios/fetching/error-handling
In the example, I've handled it as follows:
func handleThing1Result(_ result: Result<GraphQLResult<Thing1Subscription.Data>, any Error>) {
    switch result {
    case let .success(graphQLResult):
        //print("AppController: handleThing1Result success \(graphQLResult)")
        thing1SubscriptionState = .success("\(graphQLResult.data?.thing1Subscription.id ?? "nil")")
    case let .failure(e):
        print("AppController: handleThing1Result network error \(e)")
        thing1SubscriptionState = .networkFailure // SET BREAKPOINT HERE
    }
}
(in client/apolloquestion0001/apolloquestion0001/ContentView.swift; you can set a breakpoint as indicated by the comment to catch the issue in action)
So the question is, what should we do at Thing1's case let .failure(e): when we get an error about an unprocessed Thing2 message?
- We could ignore it, but since (as far as I can tell) there isn't any publicly defined error code we can look at, there's no way we can distinguish ignorable errors from non-ignorable errors arriving here. I.e., we can't tell one type of error from another in case let .failure(e): so the only way we can ignore this specific error is to ignore all Network Errors.
- Or is it truly the case that all "Network Error" failures should be treated as recoverable, and we should just log them and/or ignore everything arriving via case let .failure(e):?
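One possible workaround, sketched under assumptions: the log line above shows the error printed as a WebSocketError whose kind is unprocessedMessage with the raw message text attached. If that kind value is readable from application code in the SDK version in use (not verified here), the failure branch could check whether the raw message carries a root field belonging to a different subscription. The sketch below uses a local stand-in type, `StandInWebSocketError`, so that it runs on its own; `isStrayMessage` and its `expectedRootField` parameter are hypothetical names.

```swift
import Foundation

// Local stand-in mirroring the shape printed in the log above. In an app this
// would be ApolloWebSocket's WebSocketError (assumption: `kind` is readable).
struct StandInWebSocketError: Error {
    enum ErrorKind {
        case unprocessedMessage(String)
        case networkError
    }
    let kind: ErrorKind
}

/// Returns true when `error` wraps an unprocessed message whose payload does
/// not contain `expectedRootField`, i.e. it was meant for another subscription.
/// Any other error, or a message that cannot be parsed, returns false.
func isStrayMessage(_ error: Error, expectedRootField: String) -> Bool {
    guard let wsError = error as? StandInWebSocketError,
          case let .unprocessedMessage(raw) = wsError.kind,
          let json = try? JSONSerialization.jsonObject(with: Data(raw.utf8)),
          let message = json as? [String: Any],
          let payload = message["payload"] as? [String: Any],
          let data = payload["data"] as? [String: Any]
    else { return false }
    return data[expectedRootField] == nil
}

// Raw message shaped like the one in the log, shortened for readability.
let raw = #"{"id":"2","type":"next","payload":{"data":{"thing2Subscription":{"id":"Thing2_1573714"}}}}"#
let stray = StandInWebSocketError(kind: .unprocessedMessage(raw))
print(isStrayMessage(stray, expectedRootField: "thing1Subscription")) // prints true
```

With a check like this, the Thing1 handler could log and skip stray Thing2 messages while still treating every other failure as a real network error. It relies on the error's printed shape staying stable, so it is a stopgap rather than a fix for the underlying routing.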
(Side note: the sample server is providing an artificially high rate of subscription updates in order to make the issue more likely to happen. Our real application doesn't update this frequently, but does intermittently see the same unprocessed message error on non-cancelled subscriptions).
Hi @wilsonmhpn 👋🏻 - thanks for the detailed reproduction case. We'll investigate and get back to you.
I'm also seeing this in my project with some regularity. It's not causing any real harm, but it would be nice to silence the error.