opcua
opcua copied to clipboard
Possible flaw in autoreconnect logic
Related to https://github.com/influxdata/telegraf/issues/11559 and possibly related to https://github.com/gopcua/opcua/issues/597 and https://github.com/gopcua/opcua/issues/434.
The errors below are generated using the slightly modified read example here https://github.com/R290/opcua/tree/autoreconnect/examples/read with the docker-compose
in the same folder.
The output with debugging disabled:
Starting acquisition
time.Date(2022, time.August, 13, 8, 29, 21, 403337000, time.UTC)
Starting acquisition
time.Date(2022, time.August, 13, 8, 29, 31, 403580000, time.UTC)
When restarting the opcua container:
Starting acquisition
time.Date(2022, time.August, 13, 8, 30, 41, 724244000, time.UTC)
Starting acquisition
Read failed: The operation could not complete because the client is not connected to the server. StatusBadServerNotConnected (0x800D0000)
Starting acquisition
Read failed: The session id is not valid. StatusBadSessionIDInvalid (0x80250000)
Starting acquisition
Read failed: The session id is not valid. StatusBadSessionIDInvalid (0x80250000)
Starting acquisition
Read failed: The session id is not valid. StatusBadSessionIDInvalid (0x80250000)
Starting acquisition
Read failed: The session id is not valid. StatusBadSessionIDInvalid (0x80250000)
Starting acquisition
time.Date(2022, time.August, 13, 8, 31, 41, 715146000, time.UTC)
Starting acquisition
Read failed: The operation could not complete because the client is not connected to the server. StatusBadServerNotConnected (0x800D0000)
It does not recover cleanly.
Normal behavior with debugging enabled:
Starting acquisition
debug: uasc 1/9: send *ua.ReadRequest with 108 bytes
debug: uacp 1: recv MSGF with 86 bytes
debug: uasc 1/9: recv MSGF with 86 bytes
debug: uasc 1/9: recv *ua.ReadResponse
debug: uasc 1/9: sending *ua.ReadResponse to handler
time.Date(2022, time.August, 13, 8, 33, 40, 166392000, time.UTC)
When restarting the opcua container the debug console is flooded witht the following repeated message patten:
debug: client: monitor: disconnected
debug: client: monitor: auto-reconnecting
debug: client: monitor: action: createSecureChannel
debug: uacp 541: close
debug: publish: PublishResponse: {"ResponseHeader":{"Timestamp":"2022-08-13T08:33:58.30056Z","RequestHandle":9,"ServiceResult":2155413504,"ServiceDiagnostics":{"EncodingMask":0,"SymbolicID":0,"NamespaceURI":0,"Locale":0,"LocalizedText":0,"AdditionalInfo":"","InnerStatusCode":0,"InnerDiagnosticInfo":null},"StringTable":null,"AdditionalHeader":{"EncodingMask":0,"TypeID":{"NodeID":"i=0","NamespaceURI":"","ServerIndex":0},"Value":null}},"SubscriptionID":0,"AvailableSequenceNumbers":null,"MoreNotifications":false,"NotificationMessage":{"SequenceNumber":0,"PublishTime":"0001-01-01T00:00:00Z","NotificationData":null},"Results":null,"DiagnosticInfos":null}
debug: publish: error: no subscriptions but the publishing loop is still running: There is no subscription available for this session. StatusBadNoSubscription (0x80790000)
debug: sub: error: There is no subscription available for this session. StatusBadNoSubscription (0x80790000)
debug: sub: pause
debug: sub: pause: pause
debug: uasc 541: readChunk EOF
debug: uasc 541: Close()
Via Wireshark at localhost:3000 we can also see that the server is being swamped with packages (using filter tcp.port == 4840
). The reason Nginx is present is so we can simulate a dropped connection as well by restarting the nginx container. Behavior is similar to restarting the opcua server itself.
To me it seems that flooding the server with requests on a dropped connection is definitely not as designed...
same issue here. OPCUA Server gets flooded with wrong session ids.
Just checked with v0.5.2 and the flooding of the server no longer occurs. @Scaredon which version are you using? The read example at https://github.com/gopcua/opcua/blob/main/examples/read/read.go has also been modified to make a distinction between different error types.
@R290 I've retested today. newest telegraf plugin with v0.5.2 goopcua. Local Build. Autoreconnect is set to true.
- start opcuaserver
- start telegraf
- stop opcuaserver
- start opcuaserver
The server is spamed with the sequence (not sure if this is the correct order, log frequency is pretty high):
- Got an authentication request from the application 'Telegraf' with the TokenType 'ANONYMOUS_0'
- accept Anonymous authentication request
- reported: Failed to get session: id not found 404076375
Sadly, I'm unable to reproduce this. Could you provide any information about the server and authentication method you are using?