iothub
AMQP Link Detach
I have an issue where I'm occasionally unable to publish events. Unfortunately, I haven't been able to identify any circumstances that trigger it. It has happened several times, but in most cases publishing works as expected.
In essence the code works as follows:
    ctx := context.Background()
    message := []byte("Hello, World!")
    expiry := 10 * 60 * time.Second
    deviceId := "some-device"
    if err := client.SendEvent(
        ctx,
        deviceId,
        message,
        iotservice.WithSendAck((iotservice.AckType)("full")),
        iotservice.WithSendExpiryTime(time.Now().Add(expiry)),
    ); err != nil {
        return err
    }
The error is the following:
link detached, reason: *Error{Condition: amqp:link:detach-forced, Description: Server Busy. Please retry operation, Info: map[]}
The Java SDK seems to have this comment regarding the error:
/**
* An operator intervened to detach for some reason.
*/
LINK_DETACH_FORCED("amqp:link:detach-forced"),
Same with the JS one: https://github.com/Azure/amqp-common-js/blob/master/lib/errors.ts#L171.
So it seems to me that this error can occur from time to time. For me, a restart has always resolved it, so I assume one way to handle it is to simply reconnect the client.
It seems to happen roughly once a week. That could mean Azure has some sort of 7-day timeout, in which case we should reconnect gracefully when it occurs.
Some relevant information from the Python library's troubleshooting guide:
https://github.com/Azure/azure-sdk-for-python/blob/a7ec3bca94251b6a73de347112d4a77e6e615ccc/sdk/eventhub/azure-eventhub/TROUBLESHOOTING.md?plain=1#L32
All Event Hubs exceptions are wrapped in an [EventHubError][EventHubError]. They often have an underlying AMQP error code which specifies whether an error should be retried. For retryable errors (ie. amqp:connection:forced or amqp:link:detach-forced), the client libraries will attempt to recover from these errors based on the [retry options][AmqpRetryOptions] specified when instantiating the client. To configure retry options, follow the sample [Client Creation][ClientCreation]. If the error is non-retryable, there is some configuration issue that needs to be resolved.
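Going by that description, a retry check on our side could look roughly like the sketch below. It only knows the two conditions named in the quote, and it pulls the condition out of the error text in the format shown in the error above. Both function names are made up for illustration, and matching on the message text is an assumption: a typed check would be preferable where the AMQP library exposes one.

```go
package main

import "strings"

// retryableConditions holds the two AMQP conditions the Python guide names as
// retryable. The real set may be larger.
var retryableConditions = map[string]bool{
	"amqp:connection:forced":  true,
	"amqp:link:detach-forced": true,
}

// conditionOf extracts the value after "Condition: " from an error message
// shaped like the one reported in this issue. It returns "" if none is found.
func conditionOf(msg string) string {
	const marker = "Condition: "
	i := strings.Index(msg, marker)
	if i < 0 {
		return ""
	}
	rest := msg[i+len(marker):]
	if j := strings.IndexAny(rest, ",}"); j >= 0 {
		rest = rest[:j]
	}
	return strings.TrimSpace(rest)
}

// isRetryable reports whether err carries one of the known retryable conditions.
func isRetryable(err error) bool {
	return err != nil && retryableConditions[conditionOf(err.Error())]
}
```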
We believe the following code is the cause: once a link is detached, nothing retries to set up a new session and link.
https://github.com/amenzhinsky/iothub/blob/master/iotservice/client.go#L171-L189
Note how, when putting a token fails, the code just returns and never tries again. Most likely we then become unauthorized, get kicked by the server, and the link becomes detached.