opcua-asyncio
opcua-asyncio copied to clipboard
Memory balloons after "Subscription Subscription(id:<id>) has expired."
We've noticed that in our production environment, where we run multiple instances (19 right now) of asyncua-based OPC UA servers, some instances enter some strange mode from which recovery is difficult. A symptom of this runaway mode is memory ballooning until the limit is reached.
From the moment the issue manifests itself, the logs are blasted with the following statements, following "the session id is not valid" error:
2019-12-17 15:26:46 [ERROR] asyncua.server.uaprocessor - sending service fault response: The session id is not valid. (BadSessionIdInvalid)
2019-12-17 15:31:25 [WARNING] asyncua.server.internal_subscription - Subscription Subscription(id:93) has expired, publish cycle count(101) > lifetime count (100)
2019-12-17 15:31:25 [WARNING] asyncua.server.internal_subscription - Subscription Subscription(id:86) has expired, publish cycle count(101) > lifetime count (100)
2019-12-17 15:31:25 [WARNING] asyncua.server.internal_subscription - Subscription Subscription(id:93) has expired, publish cycle count(101) > lifetime count (100)
2019-12-17 15:31:25 [WARNING] asyncua.server.internal_subscription - Subscription Subscription(id:86) has expired, publish cycle count(101) > lifetime count (100)
2019-12-17 15:31:25 [WARNING] asyncua.server.internal_subscription - Subscription Subscription(id:93) has expired, publish cycle count(101) > lifetime count (100)
2019-12-17 15:31:25 [WARNING] asyncua.server.internal_subscription - Subscription Subscription(id:86) has expired, publish cycle count(101) > lifetime count (100)
2019-12-17 15:31:26 [WARNING] asyncua.server.internal_subscription - Subscription Subscription(id:93) has expired, publish cycle count(101) > lifetime count (100)
2019-12-17 15:31:26 [WARNING] asyncua.server.internal_subscription - Subscription Subscription(id:86) has expired, publish cycle count(101) > lifetime count (100)
2019-12-17 15:31:26 [WARNING] asyncua.server.internal_subscription - Subscription Subscription(id:93) has expired, publish cycle count(101) > lifetime count (100)
2019-12-17 15:31:26 [WARNING] asyncua.server.internal_subscription - Subscription Subscription(id:86) has expired, publish cycle count(101) > lifetime count (100)
2019-12-17 15:31:26 [WARNING] asyncua.server.internal_subscription - Subscription Subscription(id:93) has expired, publish cycle count(101) > lifetime count (100)
2019-12-17 15:31:26 [WARNING] asyncua.server.internal_subscription - Subscription Subscription(id:86) has expired, publish cycle count(101) > lifetime count (100)
2019-12-17 15:31:26 [WARNING] asyncua.server.internal_subscription - Subscription Subscription(id:93) has expired, publish cycle count(101) > lifetime count (100)
2019-12-17 15:31:26 [WARNING] asyncua.server.internal_subscription - Subscription Subscription(id:86) has expired, publish cycle count(101) > lifetime count (100)
2019-12-17 15:31:26 [WARNING] asyncua.server.internal_subscription - Subscription Subscription(id:93) has expired, publish cycle count(101) > lifetime count (100)
2019-12-17 15:31:26 [WARNING] asyncua.server.internal_subscription - Subscription Subscription(id:86) has expired, publish cycle count(101) > lifetime count (100)
2019-12-17 15:31:27 [WARNING] asyncua.server.internal_subscription - Subscription Subscription(id:93) has expired, publish cycle count(101) > lifetime count (100)
2019-12-17 15:31:27 [WARNING] asyncua.server.internal_subscription - Subscription Subscription(id:86) has expired, publish cycle count(101) > lifetime count (100)
2019-12-17 15:31:27 [WARNING] asyncua.server.internal_subscription - Subscription Subscription(id:93) has expired, publish cycle count(101) > lifetime count (100)
2019-12-17 15:31:27 [WARNING] asyncua.server.internal_subscription - Subscription Subscription(id:86) has expired, publish cycle count(101) > lifetime count (100)
2019-12-17 15:31:27 [WARNING] asyncua.server.internal_subscription - Subscription Subscription(id:93) has expired, publish cycle count(101) > lifetime count (100)
2019-12-17 15:31:27 [WARNING] asyncua.server.internal_subscription - Subscription Subscription(id:86) has expired, publish cycle count(101) > lifetime count (100)
2019-12-17 15:31:27 [WARNING] asyncua.server.internal_subscription - Subscription Subscription(id:93) has expired, publish cycle count(101) > lifetime count (100)
2019-12-17 15:31:27 [WARNING] asyncua.server.internal_subscription - Subscription Subscription(id:86) has expired, publish cycle count(101) > lifetime count (100)
2019-12-17 15:31:27 [WARNING] asyncua.server.internal_subscription - Subscription Subscription(id:93) has expired, publish cycle count(102) > lifetime count (100)
2019-12-17 15:31:27 [WARNING] asyncua.server.internal_subscription - Subscription Subscription(id:86) has expired, publish cycle count(102) > lifetime count (100)
2019-12-17 15:31:28 [WARNING] asyncua.server.internal_subscription - Subscription Subscription(id:93) has expired, publish cycle count(102) > lifetime count (100)
2019-12-17 15:31:28 [WARNING] asyncua.server.internal_subscription - Subscription Subscription(id:86) has expired, publish cycle count(102) > lifetime count (100)
2019-12-17 15:31:28 [WARNING] asyncua.server.internal_subscription - Subscription Subscription(id:93) has expired, publish cycle count(102) > lifetime count (100)
...
Sometimes, when this happens, nothing else gets logged except that statement, in the thousands. The above log goes on to "publish cycle count" 2320 and beyond:
2019-12-17 17:01:16 [WARNING] asyncua.server.internal_subscription - Subscription Subscription(id:101) has expired, publish cycle count(2320) > lifetime count (100)
We've noticed serious connectivity issues on the production network, which seem to trigger this phenomenon. The "The session id is not valid" error indicates an OPC UA client is reconnecting on account of a connectivity dropout.
Any ideas what might happen here at the subscription level from the perspective of the asyncua library?
thanks for the report. I have not read that code for a long time but I think there has been a PR a few week ago, can you try on master? or try an older version to see if that happens?
and great that you report that kind of things, this is hard to test since it never happens in short testing env.
I looked at spec:
ubscriptions have a lifetime counter that counts the number of consecutive publishing cycles in which there have been no Publish requests available to send a Publish response for the Subscription. Any Service call that uses the SubscriptionId or the processing of a Publish response resets the lifetime counter of this Subscription. When this counter reaches the value calculated for the lifetime of a Subscription based on the MaxKeepAliveCount parameter in the CreateSubscription Service (5.13.2), the Subscription is closed. Closing the Subscription causes its MonitoredItems to be deleted. In addition the Server shall issue a StatusChangeNotification notificationMessage with the status code Bad_Timeout. The StatusChangeNotification notificationMessage type is defined in 7.20.4.
and code where there is huge FIXME
https://github.com/FreeOpcUa/opcua-asyncio/blob/master/asyncua/server/internal_subscription.py#L100
It looks like we have stopped receiving publishingrequests from server and that we should close the subscription but we do not... PR welcome
but this is the side effect of the first one which is a connection issue. (anyway it explains the memory ballons since we just fill up memory with events that should be sent to server and that are not)
I've baselined our code on the version corresponding to the commit "Fix Issue #108," available since about 18 days ago.
But the version where we have the above issue, is much older. It dates back to "Fix #71." We're planning a new deployment soon.
In this particular extreme case, where the network goes haywire, I assume the OPC UA clients and servers are seeing disconnects constantly and are desperately attempting to restore the connection.
I merged a fix for session renewal. You can also try it. It might solve the fundamental issue in your case
I've also been running into this recently. My scenario is I have both a asyncua
client and server. The connection between the two is occasionally unstable, and thus when the client disconnects (and closes the session), the request to delete all of the client's subscriptions here never actually makes it to the server. The subscription results still queue up here which I believe is where the memory is leaking. If I read the spec correctly, the subscription should be deleted after going over the lifetime limit, but as far as I can tell it does not, which means memory will continue to be leaked until the program crashes.
For now, I can rely on my client auto-reconnecting to the server, so I simply delete all of the old subscriptions upon reconnection, but I think the long term fix might be deleting the subscriptions from the server side once they expire.
I found several of our OPC UA servers frantically logging the expiry notification again. I updated the servers to version 1.0.4, only a few weeks back. The last update goes back to December 2022, when I bumped it up to 1.0.1.
2023-09-06 14:51:10,207 [WARNING] asyncua.server.internal_subscription - Subscription Subscription(id:85) has expired, publish cycle count(1055148) > lifetime count (12000)
2023-09-06 14:51:10,307 [WARNING] asyncua.server.internal_subscription - Subscription Subscription(id:85) has expired, publish cycle count(1055149) > lifetime count (12000)
2023-09-06 14:51:10,407 [WARNING] asyncua.server.internal_subscription - Subscription Subscription(id:85) has expired, publish cycle count(1055150) > lifetime count (12000)
2023-09-06 14:51:10,506 [WARNING] asyncua.server.internal_subscription - Subscription Subscription(id:85) has expired, publish cycle count(1055151) > lifetime count (12000)
2023-09-06 14:51:10,607 [WARNING] asyncua.server.internal_subscription - Subscription Subscription(id:85) has expired, publish cycle count(1055152) > lifetime count (12000)
2023-09-06 14:51:10,707 [WARNING] asyncua.server.internal_subscription - Subscription Subscription(id:85) has expired, publish cycle count(1055153) > lifetime count (12000)
2023-09-06 14:51:10,807 [WARNING] asyncua.server.internal_subscription - Subscription Subscription(id:85) has expired, publish cycle count(1055154) > lifetime count (12000)
2023-09-06 14:51:10,906 [WARNING] asyncua.server.internal_subscription - Subscription Subscription(id:85) has expired, publish cycle count(1055155) > lifetime count (12000)
2023-09-06 14:51:11,007 [WARNING] asyncua.server.internal_subscription - Subscription Subscription(id:85) has expired, publish cycle count(1055156) > lifetime count (12000)
2023-09-06 14:51:11,107 [WARNING] asyncua.server.internal_subscription - Subscription Subscription(id:85) has expired, publish cycle count(1055157) > lifetime count (12000)
2023-09-06 14:51:11,207 [WARNING] asyncua.server.internal_subscription - Subscription Subscription(id:85) has expired, publish cycle count(1055158) > lifetime count (12000)
2023-09-06 14:51:11,307 [WARNING] asyncua.server.internal_subscription - Subscription Subscription(id:85) has expired, publish cycle count(1055159) > lifetime count (12000)
2023-09-06 14:51:11,406 [WARNING] asyncua.server.internal_subscription - Subscription Subscription(id:85) has expired, publish cycle count(1055160) > lifetime count (12000)
2023-09-06 14:51:11,508 [WARNING] asyncua.server.internal_subscription - Subscription Subscription(id:85) has expired, publish cycle count(1055161) > lifetime count (12000)
2023-09-06 14:51:11,606 [WARNING] asyncua.server.internal_subscription - Subscription Subscription(id:85) has expired, publish cycle count(1055162) > lifetime count (12000)
2023-09-06 14:51:11,706 [WARNING] asyncua.server.internal_subscription - Subscription Subscription(id:85) has expired, publish cycle count(1055163) > lifetime count (12000)
2023-09-06 14:51:11,807 [WARNING] asyncua.server.internal_subscription - Subscription Subscription(id:85) has expired, publish cycle count(1055164) > lifetime count (12000)
2023-09-06 14:51:11,907 [WARNING] asyncua.server.internal_subscription - Subscription Subscription(id:85) has expired, publish cycle count(1055165) > lifetime count (12000)
2023-09-06 14:51:12,007 [WARNING] asyncua.server.internal_subscription - Subscription Subscription(id:85) has expired, publish cycle count(1055166) > lifetime count (12000)
2023-09-06 14:51:12,106 [WARNING] asyncua.server.internal_subscription - Subscription Subscription(id:85) has expired, publish cycle count(1055167) > lifetime count (12000)
2023-09-06 14:51:12,206 [WARNING] asyncua.server.internal_subscription - Subscription Subscription(id:85) has expired, publish cycle count(1055168) > lifetime count (12000)
2023-09-06 14:51:12,306 [WARNING] asyncua.server.internal_subscription - Subscription Subscription(id:85) has expired, publish cycle count(1055169) > lifetime count (12000)
2023-09-06 14:51:12,406 [WARNING] asyncua.server.internal_subscription - Subscription Subscription(id:85) has expired, publish cycle count(1055170) > lifetime count (12000)
I will keep the servers under observation; is there anything I can do to get a better understanding of the root cause of this issue?
I am facing the same issues. Did anyone manage to get a rid of this yet?
We are encountering the same issue that was reported here. Has anyone managed to resolve it?