NServiceBus.AmazonSQS

Excessive SNS topic queries when event has no subscribers leading to rate limit hits

Open • BenSaneckiCJ opened this issue 1 year ago • 3 comments

Describe the bug

We've recently observed that publishing an event with no subscribers frequently hits the rate limit for SNS topic queries. This seems to happen because no topic is found for the event, which might prevent a cache entry from being created. As a result, every publish attempt queries SNS again to work out the event's destination. I noticed that this scenario may already have been considered, particularly around HybridPubSubChecker.ThisIsAPublishMessageNotUsingMessageDrivenPubSub, but I wanted to raise it because it caused errors in our production environment.

Steps to reproduce

  1. Set up an environment where an event is being published but has no subscribers.
  2. Publish more than 30 events per second.
  3. Monitor the number of SNS topic queries made by the system during this process.

Relevant log output

Amazon.SimpleNotificationService.AmazonSimpleNotificationServiceException: Rate exceeded
 ---> Amazon.Runtime.Internal.HttpErrorResponseException: Exception of type 'Amazon.Runtime.Internal.HttpErrorResponseException' was thrown.
   at Amazon.Runtime.HttpWebRequestMessage.GetResponseAsync(CancellationToken cancellationToken)
   at Amazon.Runtime.Internal.HttpHandler`1.InvokeAsync[T](IExecutionContext executionContext)
   at Amazon.Runtime.Internal.Unmarshaller.InvokeAsync[T](IExecutionContext executionContext)
   at Amazon.Runtime.Internal.ErrorHandler.InvokeAsync[T](IExecutionContext executionContext)
   --- End of inner exception stack trace ---
   at Amazon.Runtime.Internal.HttpErrorResponseExceptionHandler.HandleExceptionStream(IRequestContext requestContext, IWebResponseData httpErrorResponse, HttpErrorResponseException exception, Stream responseStream)
   at Amazon.Runtime.Internal.HttpErrorResponseExceptionHandler.HandleExceptionAsync(IExecutionContext executionContext, HttpErrorResponseException exception)
   at Amazon.Runtime.Internal.ExceptionHandler`1.HandleAsync(IExecutionContext executionContext, Exception exception)
   at Amazon.Runtime.Internal.ErrorHandler.ProcessExceptionAsync(IExecutionContext executionContext, Exception exception)
   at Amazon.Runtime.Internal.ErrorHandler.InvokeAsync[T](IExecutionContext executionContext)
   at Amazon.Runtime.Internal.CallbackHandler.InvokeAsync[T](IExecutionContext executionContext)
   at Amazon.Runtime.Internal.EndpointDiscoveryHandler.InvokeAsync[T](IExecutionContext executionContext)
   at Amazon.Runtime.Internal.EndpointDiscoveryHandler.InvokeAsync[T](IExecutionContext executionContext)
   at Amazon.Runtime.Internal.CredentialsRetriever.InvokeAsync[T](IExecutionContext executionContext)
   at Amazon.Runtime.Internal.RetryHandler.InvokeAsync[T](IExecutionContext executionContext)
   at Amazon.Runtime.Internal.RetryHandler.InvokeAsync[T](IExecutionContext executionContext)
   at Amazon.Runtime.Internal.CallbackHandler.InvokeAsync[T](IExecutionContext executionContext)
   at Amazon.Runtime.Internal.CallbackHandler.InvokeAsync[T](IExecutionContext executionContext)
   at Amazon.Runtime.Internal.ErrorCallbackHandler.InvokeAsync[T](IExecutionContext executionContext)
   at Amazon.Runtime.Internal.MetricsHandler.InvokeAsync[T](IExecutionContext executionContext)
   at Datadog.Trace.ClrProfiler.CallTarget.Handlers.Continuations.TaskContinuationGenerator`4.SyncCallbackHandler.ContinuationAction(Task`1 previousTask, TTarget target, CallTargetState state)
   at Amazon.SimpleNotificationService.AmazonSimpleNotificationServiceClient.FindTopicAsync(String topicName)
   at NServiceBus.Transport.SQS.TopicCache.GetAndCacheTopicIfFound(MessageMetadata metadata) in /_/src/NServiceBus.Transport.SQS/TopicCache.cs:line 62
   at NServiceBus.Transport.SQS.MessageDispatcher.ApplyMulticastOperationMappingIfNecessary(MulticastTransportOperation transportOperation, SnsPreparedMessage snsPreparedMessage) in /_/src/NServiceBus.Transport.SQS/MessageDispatcher.cs:line 347
   at NServiceBus.Transport.SQS.MessageDispatcher.PrepareMessage[TMessage](IOutgoingTransportOperation transportOperation, HashSet`1 messageIdsOfMulticastedEvents, TransportTransaction transportTransaction) in /_/src/NServiceBus.Transport.SQS/MessageDispatcher.cs:line 336
   at NServiceBus.Transport.SQS.MessageDispatcher.Dispatch(MulticastTransportOperation transportOperation, HashSet`1 messageIdsOfMulticastedEvents, TransportTransaction transportTransaction) in /_/src/NServiceBus.Transport.SQS/MessageDispatcher.cs:line 202
   at NServiceBus.Transport.SQS.MessageDispatcher.Dispatch(TransportOperations outgoingMessages, TransportTransaction transaction, ContextBag context) in /_/src/NServiceBus.Transport.SQS/MessageDispatcher.cs:line 59

Additional Information

Workarounds

As the events are not currently consumed anywhere, we have stopped publishing them for the time being.

Possible solutions

Additional information

NServiceBus.AmazonSQS version 5.6.1, .NET 6

BenSaneckiCJ • Sep 01 '23 16:09

Thanks for raising this, @BenSaneckiCJ. I'm sorry for the issue you're facing.

I think I know where the problem is. We have a TopicCache to prevent those scenarios from happening. The problem is that we cache null search results (that is, topics that were not found) only when the hybrid mode is enabled.
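To make the difference concrete, here is a minimal sketch in Python, not the transport's actual C# code. The class name `TopicCache` is borrowed from the thread, but everything else (`find_topic`, `cache_not_found`, `count_lookups`) is a hypothetical stand-in, and the remote SNS query is replaced by a local function that counts calls. It only illustrates how remembering "not found" results changes the number of lookups.

```python
from typing import Callable, Dict, Optional

_MISSING = object()  # sentinel: "no cache entry yet", distinct from a cached None


class TopicCache:
    """Toy stand-in for a topic cache; NOT the transport's real TopicCache."""

    def __init__(self, find_topic: Callable[[str], Optional[str]], cache_not_found: bool):
        self._find_topic = find_topic            # hypothetical remote lookup (e.g. an SNS query)
        self._cache_not_found = cache_not_found  # whether "not found" results are remembered
        self._entries: Dict[str, Optional[str]] = {}

    def get_topic(self, event_type: str) -> Optional[str]:
        cached = self._entries.get(event_type, _MISSING)
        if cached is not _MISSING:
            return cached                        # hit, including a remembered "not found"
        topic = self._find_topic(event_type)     # the call that counts against the rate limit
        if topic is not None or self._cache_not_found:
            self._entries[event_type] = topic
        return topic


def count_lookups(cache_not_found: bool, publishes: int = 100) -> int:
    """Publish the same subscriber-less event repeatedly and count remote lookups."""
    calls = 0

    def find_topic(event_type: str) -> Optional[str]:
        nonlocal calls
        calls += 1
        return None  # the event has no subscribers, so no topic exists

    cache = TopicCache(find_topic, cache_not_found)
    for _ in range(publishes):
        cache.get_topic("MyEvent")
    return calls
```

In this sketch, `count_lookups(False)` performs one lookup per publish (100 for 100 publishes), while `count_lookups(True)` performs a single lookup. A real cache would presumably also need to expire "not found" entries so that a later subscription is noticed; that detail is left out here.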

I'll have a look at the beginning of next week.

mauroservienti • Sep 01 '23 19:09

@BenSaneckiCJ, I cannot reproduce the issue on 5.7 when NOT using the hybrid mode, and 5.6.x is out of support.

Are you using the hybrid pub/sub mode?

mauroservienti • Sep 02 '23 07:09

It looks like we're not. I'll try updating to 5.7.2 and see if the issue comes back.

BenSaneckiCJ • Sep 05 '23 16:09