
Azure SignalR Service randomly drops connections immediately after successful connect

Open ph-ict opened this issue 3 years ago • 10 comments

Describe the bug

We have 4 virtually identical Azure SignalR services and two of them are behaving badly. Using the same client and hub, I can connect to two of them without issues. The other two have been exhibiting a strange bug since about Thursday/Friday last week where the client will often connect (via the Hub) and then instantly disconnect with the error:

Microsoft.AspNetCore.SignalR.HubException: The server closed the connection with the following error: Connection closed with an error.

If the client retries, sometimes 10 or more times, it eventually connects successfully and then works as expected.

We are not hitting any service limits. We are currently using the "Standard" size. The services are not used in production (yet), but we need to get to the bottom of the issue to know that this won't happen again.

To Reproduce

I can email details on how to connect to our API with a reproducible code sample on request.

Exceptions (if any)

Microsoft.AspNetCore.SignalR.HubException: The server closed the connection with the following error: Connection closed with an error.

Further technical details

  • Microsoft.AspNetCore.SignalR.Client v5.0.11
  • Microsoft.AspNetCore.SignalR.Core v1.1.0

ph-ict avatar Nov 16 '21 22:11 ph-ict

Strangely, this condition seems to have resolved itself some time over the weekend. While that's good, it's not very reassuring that two services went bad at the same time for a week and then mysteriously recovered. I would still appreciate someone looking into this, please.

ph-ict avatar Nov 22 '21 01:11 ph-ict

Please email me at lianwei(at)microsoft.com with the name of your bad instance and some failure error logs with timestamps so I can take a look. Thanks!

vicancy avatar Nov 23 '21 01:11 vicancy

I started to experience the same issue last week. I don't see any errors in my app logs, and I haven't updated Microsoft.Azure.SignalR (1.13.0). Even if I remove all of the logic from my hub and leave it as follows:

[Authorize]
public class EventsHub : Hub
{
    public EventsHub()
    {
    }

    public Task SendEventAsync(IExternalDomainEvent @event, CancellationToken cancellationToken)
    {
        return Task.CompletedTask;
    }
}

The error "The server closed the connection with the following error: Connection closed with an error." still shows up.

I have three instances (UKS region) of Azure SignalR standard and all of them are experiencing the same issue.

I tried updating Microsoft.Azure.SignalR to the latest version, 1.15.0, but that didn't help.

robertlyson avatar Jan 25 '22 20:01 robertlyson

Interesting. Unfortunately our issues resolved after a week or so before we could diagnose them fully, and investigation from the Microsoft team didn't reveal anything useful. It would be great if you manage to work out what is going wrong. I created a simple client that connected successfully to the working instance and failed connecting to the faulty ones. If you can do the same and send them timestamps perhaps they can work out what's going on.

Send an email to @vicancy as above and please let me know what you find.
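A probe along those lines could look like the following minimal sketch. It is not the client used above: the hub URLs are hypothetical placeholders, the attempt count and delays are arbitrary, and it assumes the hub accepts anonymous connections (a hub marked [Authorize] would also need an access token configured in WithUrl). It logs a UTC timestamp for every connect, failure, and close so the output can be sent along with the instance name.

```csharp
using System;
using System.Threading.Tasks;
using Microsoft.AspNetCore.SignalR.Client;

// Hypothetical endpoints: replace with one known-good and one failing hub URL.
var hubUrls = new[]
{
    "https://working.example.invalid/events",
    "https://failing.example.invalid/events",
};

foreach (var url in hubUrls)
{
    var connection = new HubConnectionBuilder().WithUrl(url).Build();

    // Fires when an established connection is lost; error is null on a clean close.
    connection.Closed += error =>
    {
        var reason = error?.Message ?? "no error";
        Log(url, "closed: " + reason);
        return Task.CompletedTask;
    };

    for (var attempt = 1; attempt <= 10; attempt++)
    {
        try
        {
            await connection.StartAsync();
            Log(url, $"attempt {attempt}: connected");

            // Wait briefly to see whether the service drops the connection right away.
            await Task.Delay(TimeSpan.FromSeconds(5));
            if (connection.State == HubConnectionState.Connected)
            {
                Log(url, $"attempt {attempt}: still connected after 5 s");
                break;
            }
        }
        catch (Exception ex)
        {
            Log(url, $"attempt {attempt}: start failed: {ex.Message}");
        }

        await Task.Delay(TimeSpan.FromSeconds(2));
    }

    await connection.DisposeAsync();
}

// Prefix every line with an ISO 8601 UTC timestamp.
static void Log(string url, string message) =>
    Console.WriteLine($"{DateTimeOffset.UtcNow:O} {url} {message}");
```

Automatic reconnect is deliberately left off so that each drop shows up as a separate Closed event followed by an explicit new attempt.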

ph-ict avatar Jan 25 '22 20:01 ph-ict

@ph-ict many thanks for your update :+1:

robertlyson avatar Jan 25 '22 21:01 robertlyson

Interestingly, I was able to reproduce the issue 100% of the time before 9 PM UTC; now, after 9:10 PM UTC, it's gone.

robertlyson avatar Jan 25 '22 21:01 robertlyson

:o That is the same thing that concerned me - the issues started one day and stopped a week or so later without any changes :(

Any new Azure SignalR instance I created worked fine, as did one of the instances created at the same time that we hadn't started using yet, but the other two were failing. Then suddenly they both came right again at the same time. I'm convinced it's something on the Azure end. Weird.

ph-ict avatar Jan 25 '22 22:01 ph-ict

Oh and for completeness, our instances were all in AU-East

ph-ict avatar Jan 25 '22 22:01 ph-ict

I've just been getting this today, and I think I've had it at other times too. Did anything come of this? This is also in Australia East.

schotime avatar May 09 '22 11:05 schotime

OK, it turns out my problem was that the message exceeded 32KB, and I had to turn EnableDetailedErrors on to find the actual error.
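For anyone landing here with the same generic "Connection closed with an error." message, a minimal sketch of turning that option on, assuming a .NET 6 minimal-hosting app with the Microsoft.Azure.SignalR package and a connection string already present in configuration. The "/events" route and the empty EventsHub are placeholders. The commented-out line is only relevant if the oversized message is one the hub receives from a client, which this thread does not confirm.

```csharp
using Microsoft.AspNetCore.Builder;
using Microsoft.AspNetCore.SignalR;
using Microsoft.Extensions.DependencyInjection;

var builder = WebApplication.CreateBuilder(args);

builder.Services
    .AddSignalR(options =>
    {
        // Send the real exception message to the client instead of a generic one.
        // Best limited to diagnostics, since messages can leak internal details.
        options.EnableDetailedErrors = true;

        // Assumption: only applies to messages sent from client to hub.
        // The default limit on incoming messages is 32 KB.
        // options.MaximumReceiveMessageSize = 64 * 1024;
    })
    .AddAzureSignalR(); // assumes the connection string is in configuration

var app = builder.Build();

app.MapHub<EventsHub>("/events"); // placeholder route

app.Run();

// Placeholder hub; substitute your own.
public class EventsHub : Hub
{
}
```

Raising the size limit increases memory use per message, so splitting large payloads may be the better fix once the detailed error confirms the cause.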

schotime avatar May 09 '22 12:05 schotime