azure-sdk-for-java icon indicating copy to clipboard operation
azure-sdk-for-java copied to clipboard

[BUG] We sometimes (incorrectly) get "Error sending. Size of the payload exceeded maximum message size" error when scheduling a message

Open haraldk opened this issue 1 year ago • 4 comments

Describe the bug We sometimes (I believe incorrectly) get "Error sending. Size of the payload exceeded maximum message size: 256 kb" error when scheduling a message. The message is a copy of a received message, with a single added client property to control the number of times it is re-scheduled. The error message does not make sense to me, as the message was previously sent fine, and the full message body is at most a few hundred bytes (< 1 kb). It is nowhere near the 256 kb limit.

The scheduling (usually) also works after a few retries.

The cause of the exception, however, says "java.lang.IndexOutOfBoundsException: Requested min remaining bytes(155) exceeds remaining(137) in underlying ByteBuffer: java.nio.HeapByteBuffer[pos=636 lim=773 cap=773]". That makes more sense, but I don't know how/why the limit is set to 773 bytes, as that is outside the control of the client code.

Exception or Stack Trace

com.azure.messaging.servicebus.ServiceBusException: Error sending. Size of the payload exceeded maximum message size: 256 kb, errorContext[NAMESPACE: sbn-lydteamet-lydbanken-prod-azeno-1.servicebus.windows.net. ERROR CONTEXT: N/A, PATH: lydbanken-integrasjoner-virtuoso-updater/$management, REFERENCE_ID: lydbanken-integrasjoner-virtuoso-updater-mgmt:receiver, LINK_CREDIT: 0]
	at com.azure.messaging.servicebus.MessagePump.notifyMessage(MessagePump.java:159)
	at com.azure.messaging.servicebus.MessagePump.lambda$handleMessage$3(MessagePump.java:142)
	at com.azure.messaging.servicebus.implementation.instrumentation.ServiceBusReceiverInstrumentation.wrap(ServiceBusReceiverInstrumentation.java:169)
	at com.azure.messaging.servicebus.implementation.instrumentation.ServiceBusReceiverInstrumentation.instrumentProcess(ServiceBusReceiverInstrumentation.java:104)
	at com.azure.messaging.servicebus.MessagePump.handleMessage(MessagePump.java:135)
	at com.azure.messaging.servicebus.MessagePump$RunOnWorker.lambda$apply$0(MessagePump.java:212)
	at reactor.core.publisher.MonoRunnable.call(MonoRunnable.java:73)
	at reactor.core.publisher.MonoRunnable.call(MonoRunnable.java:32)
	at reactor.core.publisher.FluxSubscribeOnCallable$CallableSubscribeOnSubscription.run(FluxSubscribeOnCallable.java:228)
	at reactor.core.scheduler.ImmediateScheduler.schedule(ImmediateScheduler.java:52)
	at reactor.core.publisher.MonoSubscribeOnCallable.subscribe(MonoSubscribeOnCallable.java:52)
	at reactor.core.publisher.Mono.subscribe(Mono.java:4568)
	at reactor.core.publisher.FluxFlatMap$FlatMapMain.onNext(FluxFlatMap.java:430)
	at com.azure.messaging.servicebus.TracingFluxOperator.lambda$hookOnNext$0(TracingFluxOperator.java:67)
	at com.azure.messaging.servicebus.implementation.instrumentation.ServiceBusReceiverInstrumentation.instrumentProcess(ServiceBusReceiverInstrumentation.java:97)
	at com.azure.messaging.servicebus.TracingFluxOperator$1.lambda$subscribe$0(TracingFluxOperator.java:39)
	at com.azure.messaging.servicebus.TracingFluxOperator.hookOnNext(TracingFluxOperator.java:66)
	at reactor.core.publisher.BaseSubscriber.onNext(BaseSubscriber.java:160)
	at reactor.core.publisher.FluxMap$MapSubscriber.onNext(FluxMap.java:122)
	at com.azure.core.amqp.implementation.MessageFlux$RecoverableReactorReceiver.drainLoop(MessageFlux.java:476)
	at com.azure.core.amqp.implementation.MessageFlux$RecoverableReactorReceiver.drain(MessageFlux.java:405)
	at com.azure.core.amqp.implementation.MessageFlux$ReactorReceiverMediator.onNext(MessageFlux.java:878)
	at com.azure.core.amqp.implementation.MessageFlux$ReactorReceiverMediator.onNext(MessageFlux.java:722)
	at reactor.core.publisher.FluxPublishOn$PublishOnSubscriber.runAsync(FluxPublishOn.java:446)
	at reactor.core.publisher.FluxPublishOn$PublishOnSubscriber.run(FluxPublishOn.java:533)
	at reactor.core.scheduler.WorkerTask.call(WorkerTask.java:84)
	at reactor.core.scheduler.WorkerTask.call(WorkerTask.java:37)
	at java.base/java.util.concurrent.FutureTask.run(Unknown Source)
	at java.base/java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(Unknown Source)
	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
	at java.base/java.lang.Thread.run(Unknown Source)
Caused by: com.azure.messaging.servicebus.ServiceBusException: Error sending. Size of the payload exceeded maximum message size: 256 kb, errorContext[NAMESPACE: sbn-lydteamet-lydbanken-prod-azeno-1.servicebus.windows.net. ERROR CONTEXT: N/A, PATH: lydbanken-integrasjoner-virtuoso-updater/$management, REFERENCE_ID: lydbanken-integrasjoner-virtuoso-updater-mgmt:receiver, LINK_CREDIT: 0]
	at com.azure.messaging.servicebus.ServiceBusSenderAsyncClient.mapError(ServiceBusSenderAsyncClient.java:950)
	at reactor.core.publisher.Mono.lambda$onErrorMap$28(Mono.java:3854)
	at reactor.core.publisher.FluxOnErrorResume$ResumeSubscriber.onError(FluxOnErrorResume.java:94)
	at reactor.core.publisher.MonoFlatMap$FlatMapMain.secondError(MonoFlatMap.java:241)
	at reactor.core.publisher.MonoFlatMap$FlatMapInner.onError(MonoFlatMap.java:315)
	at reactor.core.publisher.MonoFlatMap$FlatMapMain.secondError(MonoFlatMap.java:241)
	at reactor.core.publisher.MonoFlatMap$FlatMapInner.onError(MonoFlatMap.java:315)
	at reactor.core.publisher.MonoFlatMap$FlatMapMain.secondError(MonoFlatMap.java:241)
	at reactor.core.publisher.MonoFlatMap$FlatMapInner.onError(MonoFlatMap.java:315)
	at reactor.core.publisher.MonoNext$NextSubscriber.onError(MonoNext.java:93)
	at reactor.core.publisher.FluxConcatArray$ConcatArraySubscriber.onError(FluxConcatArray.java:186)
	at reactor.core.publisher.MonoFlatMapMany$FlatMapManyMain.onError(MonoFlatMapMany.java:205)
	at reactor.core.publisher.MonoFlatMap$FlatMapMain.onNext(MonoFlatMap.java:149)
	at reactor.core.publisher.Operators$MonoSubscriber.complete(Operators.java:1865)
	at com.azure.core.amqp.implementation.AmqpChannelProcessor.subscribe(AmqpChannelProcessor.java:283)
	at reactor.core.publisher.Flux.subscribe(Flux.java:8840)
	at reactor.core.publisher.FluxConcatArray$ConcatArraySubscriber.onComplete(FluxConcatArray.java:238)
	at reactor.core.publisher.FluxConcatArray.subscribe(FluxConcatArray.java:79)
	at reactor.core.publisher.MonoFromFluxOperator.subscribe(MonoFromFluxOperator.java:83)
	at reactor.core.publisher.MonoFlatMap$FlatMapMain.onNext(MonoFlatMap.java:165)
	at reactor.core.publisher.MonoFlatMap$FlatMapMain.onNext(MonoFlatMap.java:158)
	at reactor.core.publisher.FluxMap$MapSubscriber.onNext(FluxMap.java:122)
	at reactor.core.publisher.Operators$MonoSubscriber.complete(Operators.java:1865)
	at reactor.core.publisher.MonoCacheInvalidateIf.subscribeOrReturn(MonoCacheInvalidateIf.java:172)
	at reactor.core.publisher.InternalMonoOperator.subscribe(InternalMonoOperator.java:63)
	at reactor.core.publisher.MonoFlatMap$FlatMapMain.onNext(MonoFlatMap.java:165)
	at reactor.core.publisher.Operators$ScalarSubscription.request(Operators.java:2571)
	at reactor.core.publisher.MonoFlatMap$FlatMapMain.request(MonoFlatMap.java:194)
	at reactor.core.publisher.MonoFlatMap$FlatMapInner.onSubscribe(MonoFlatMap.java:291)
	at reactor.core.publisher.MonoFlatMap$FlatMapMain.onSubscribe(MonoFlatMap.java:117)
	at reactor.core.publisher.MonoJust.subscribe(MonoJust.java:55)
	at reactor.core.publisher.MonoDefer.subscribe(MonoDefer.java:53)
	at reactor.core.publisher.InternalMonoOperator.subscribe(InternalMonoOperator.java:76)
	at reactor.core.publisher.MonoFlatMap$FlatMapMain.onNext(MonoFlatMap.java:165)
	at reactor.core.publisher.FluxPeekFuseable$PeekFuseableSubscriber.onNext(FluxPeekFuseable.java:210)
	at reactor.core.publisher.MonoFlatMap$FlatMapMain.secondComplete(MonoFlatMap.java:245)
	at reactor.core.publisher.MonoFlatMap$FlatMapInner.onNext(MonoFlatMap.java:305)
	at reactor.core.publisher.MonoFlatMap$FlatMapMain.secondComplete(MonoFlatMap.java:245)
	at reactor.core.publisher.MonoFlatMap$FlatMapInner.onNext(MonoFlatMap.java:305)
	at reactor.core.publisher.FluxMapFuseable$MapFuseableSubscriber.onNext(FluxMapFuseable.java:129)
	at reactor.core.publisher.FluxMapFuseable$MapFuseableSubscriber.onNext(FluxMapFuseable.java:129)
	at reactor.core.publisher.Operators$ScalarSubscription.request(Operators.java:2571)
	at reactor.core.publisher.FluxMapFuseable$MapFuseableSubscriber.request(FluxMapFuseable.java:171)
	at reactor.core.publisher.FluxMapFuseable$MapFuseableSubscriber.request(FluxMapFuseable.java:171)
	at reactor.core.publisher.MonoFlatMap$FlatMapInner.onSubscribe(MonoFlatMap.java:291)
	at reactor.core.publisher.FluxMapFuseable$MapFuseableSubscriber.onSubscribe(FluxMapFuseable.java:96)
	at reactor.core.publisher.FluxMapFuseable$MapFuseableSubscriber.onSubscribe(FluxMapFuseable.java:96)
	at reactor.core.publisher.MonoJust.subscribe(MonoJust.java:55)
	at reactor.core.publisher.InternalMonoOperator.subscribe(InternalMonoOperator.java:76)
	at reactor.core.publisher.MonoFlatMap$FlatMapMain.onNext(MonoFlatMap.java:165)
	at reactor.core.publisher.FluxMapFuseable$MapFuseableSubscriber.onNext(FluxMapFuseable.java:129)
	at reactor.core.publisher.MonoFlatMap$FlatMapMain.secondComplete(MonoFlatMap.java:245)
	at reactor.core.publisher.MonoFlatMap$FlatMapInner.onNext(MonoFlatMap.java:305)
	at reactor.core.publisher.MonoIgnoreThen$ThenIgnoreMain.complete(MonoIgnoreThen.java:294)
	at reactor.core.publisher.MonoIgnoreThen$ThenIgnoreMain.onNext(MonoIgnoreThen.java:188)
	at reactor.core.publisher.MonoIgnoreThen$ThenIgnoreMain.subscribeNext(MonoIgnoreThen.java:237)
	at reactor.core.publisher.MonoIgnoreThen$ThenIgnoreMain.onComplete(MonoIgnoreThen.java:204)
	at reactor.core.publisher.MonoPeekTerminal$MonoTerminalPeekSubscriber.onComplete(MonoPeekTerminal.java:299)
	at reactor.core.publisher.SerializedSubscriber.onComplete(SerializedSubscriber.java:146)
	at reactor.core.publisher.FluxTimeout$TimeoutMainSubscriber.onComplete(FluxTimeout.java:235)
	at reactor.core.publisher.MonoNext$NextSubscriber.onComplete(MonoNext.java:102)
	at reactor.core.publisher.MonoNext$NextSubscriber.onNext(MonoNext.java:83)
	at reactor.core.publisher.FluxFilterFuseable$FilterFuseableSubscriber.onNext(FluxFilterFuseable.java:118)
	at reactor.core.publisher.FluxReplay$SizeBoundReplayBuffer.replayNormal(FluxReplay.java:877)
	at reactor.core.publisher.FluxReplay$SizeBoundReplayBuffer.replay(FluxReplay.java:965)
	at reactor.core.publisher.FluxReplay$ReplayInner.request(FluxReplay.java:1711)
	at reactor.core.publisher.FluxFilterFuseable$FilterFuseableSubscriber.request(FluxFilterFuseable.java:191)
	at reactor.core.publisher.MonoNext$NextSubscriber.request(MonoNext.java:108)
	at reactor.core.publisher.Operators$MultiSubscriptionSubscriber.request(Operators.java:2331)
	at reactor.core.publisher.SerializedSubscriber.request(SerializedSubscriber.java:151)
	at reactor.core.publisher.MonoPeekTerminal$MonoTerminalPeekSubscriber.request(MonoPeekTerminal.java:139)
	at reactor.core.publisher.MonoIgnoreThen$ThenIgnoreMain.onSubscribe(MonoIgnoreThen.java:135)
	at reactor.core.publisher.MonoPeekTerminal$MonoTerminalPeekSubscriber.onSubscribe(MonoPeekTerminal.java:152)
	at reactor.core.publisher.SerializedSubscriber.onSubscribe(SerializedSubscriber.java:65)
	at reactor.core.publisher.FluxTimeout$TimeoutMainSubscriber.onSubscribe(FluxTimeout.java:155)
	at reactor.core.publisher.MonoNext$NextSubscriber.onSubscribe(MonoNext.java:70)
	at reactor.core.publisher.FluxFilterFuseable$FilterFuseableSubscriber.onSubscribe(FluxFilterFuseable.java:87)
	at reactor.core.publisher.FluxReplay.subscribeOrReturn(FluxReplay.java:1181)
	at reactor.core.publisher.FluxReplay.subscribe(FluxReplay.java:1153)
	at reactor.core.publisher.FluxAutoConnectFuseable.subscribe(FluxAutoConnectFuseable.java:61)
	at reactor.core.publisher.Mono.subscribe(Mono.java:4568)
	at reactor.core.publisher.MonoIgnoreThen$ThenIgnoreMain.subscribeNext(MonoIgnoreThen.java:265)
	at reactor.core.publisher.MonoIgnoreThen.subscribe(MonoIgnoreThen.java:51)
	at reactor.core.publisher.MonoFlatMap$FlatMapMain.onNext(MonoFlatMap.java:165)
	at reactor.core.publisher.FluxMapFuseable$MapFuseableSubscriber.onNext(FluxMapFuseable.java:129)
	at reactor.core.publisher.MonoPeekTerminal$MonoTerminalPeekSubscriber.onNext(MonoPeekTerminal.java:180)
	at reactor.core.publisher.FluxHide$SuppressFuseableSubscriber.onNext(FluxHide.java:137)
	at reactor.core.publisher.MonoIgnoreThen$ThenIgnoreMain.complete(MonoIgnoreThen.java:294)
	at reactor.core.publisher.MonoIgnoreThen$ThenIgnoreMain.onNext(MonoIgnoreThen.java:188)
	at reactor.core.publisher.MonoIgnoreThen$ThenIgnoreMain.subscribeNext(MonoIgnoreThen.java:237)
	at reactor.core.publisher.MonoIgnoreThen$ThenIgnoreMain.onComplete(MonoIgnoreThen.java:204)
	at reactor.core.publisher.SerializedSubscriber.onComplete(SerializedSubscriber.java:146)
	at reactor.core.publisher.FluxTimeout$TimeoutMainSubscriber.onComplete(FluxTimeout.java:235)
	at reactor.core.publisher.MonoNext$NextSubscriber.onComplete(MonoNext.java:102)
	at reactor.core.publisher.MonoNext$NextSubscriber.onNext(MonoNext.java:83)
	at reactor.core.publisher.FluxFilterFuseable$FilterFuseableSubscriber.onNext(FluxFilterFuseable.java:118)
	at reactor.core.publisher.FluxReplay$SizeBoundReplayBuffer.replayNormal(FluxReplay.java:877)
	at reactor.core.publisher.FluxReplay$SizeBoundReplayBuffer.replay(FluxReplay.java:965)
	at reactor.core.publisher.FluxReplay$ReplayInner.request(FluxReplay.java:1711)
	at reactor.core.publisher.FluxFilterFuseable$FilterFuseableSubscriber.request(FluxFilterFuseable.java:191)
	at reactor.core.publisher.MonoNext$NextSubscriber.request(MonoNext.java:108)
	at reactor.core.publisher.Operators$MultiSubscriptionSubscriber.request(Operators.java:2331)
	at reactor.core.publisher.SerializedSubscriber.request(SerializedSubscriber.java:151)
	at reactor.core.publisher.MonoIgnoreThen$ThenIgnoreMain.onSubscribe(MonoIgnoreThen.java:135)
	at reactor.core.publisher.SerializedSubscriber.onSubscribe(SerializedSubscriber.java:65)
	at reactor.core.publisher.FluxTimeout$TimeoutMainSubscriber.onSubscribe(FluxTimeout.java:155)
	at reactor.core.publisher.MonoNext$NextSubscriber.onSubscribe(MonoNext.java:70)
	at reactor.core.publisher.FluxFilterFuseable$FilterFuseableSubscriber.onSubscribe(FluxFilterFuseable.java:87)
	at reactor.core.publisher.FluxReplay.subscribeOrReturn(FluxReplay.java:1181)
	at reactor.core.publisher.FluxReplay.subscribe(FluxReplay.java:1153)
	at reactor.core.publisher.FluxAutoConnectFuseable.subscribe(FluxAutoConnectFuseable.java:61)
	at reactor.core.publisher.Mono.subscribe(Mono.java:4568)
	at reactor.core.publisher.MonoIgnoreThen$ThenIgnoreMain.subscribeNext(MonoIgnoreThen.java:265)
	at reactor.core.publisher.MonoIgnoreThen.subscribe(MonoIgnoreThen.java:51)
	at reactor.core.publisher.Mono.subscribe(Mono.java:4568)
	at reactor.core.publisher.FluxFlatMap.trySubscribeScalarMap(FluxFlatMap.java:205)
	at reactor.core.publisher.MonoFlatMap.subscribeOrReturn(MonoFlatMap.java:53)
	at reactor.core.publisher.InternalMonoOperator.subscribe(InternalMonoOperator.java:63)
	at reactor.core.publisher.MonoFlatMap$FlatMapMain.onNext(MonoFlatMap.java:165)
	at reactor.core.publisher.FluxMap$MapSubscriber.onNext(FluxMap.java:122)
	at reactor.core.publisher.Operators$MonoSubscriber.complete(Operators.java:1865)
	at reactor.core.publisher.MonoCacheInvalidateIf.subscribeOrReturn(MonoCacheInvalidateIf.java:172)
	at reactor.core.publisher.Mono.subscribe(Mono.java:4552)
	at reactor.core.publisher.Mono.block(Mono.java:1806)
	at com.azure.messaging.servicebus.ServiceBusSenderClient.scheduleMessage(ServiceBusSenderClient.java:362)
	at no.nrk.lydbanken.integrasjoner.common.infrastructure.servicebus.ServiceBusQueueConsumer.processMessage(ServiceBusQueueConsumer.java:115)
	at com.azure.messaging.servicebus.MessagePump.notifyMessage(MessagePump.java:157)
	... 31 common frames omitted
	Suppressed: java.lang.Exception: #block terminated with an error
		at reactor.core.publisher.BlockingSingleSubscriber.blockingGet(BlockingSingleSubscriber.java:146)
		at reactor.core.publisher.Mono.block(Mono.java:1807)
		... 34 common frames omitted
Caused by: com.azure.core.amqp.exception.AmqpException: Error sending. Size of the payload exceeded maximum message size: 256 kb, errorContext[NAMESPACE: sbn-lydteamet-lydbanken-prod-azeno-1.servicebus.windows.net. ERROR CONTEXT: N/A, PATH: lydbanken-integrasjoner-virtuoso-updater/$management, REFERENCE_ID: lydbanken-integrasjoner-virtuoso-updater-mgmt:receiver, LINK_CREDIT: 0]
	at com.azure.messaging.servicebus.implementation.ManagementChannel.lambda$schedule$13(ManagementChannel.java:355)
	at reactor.core.publisher.MonoFlatMap$FlatMapMain.onNext(MonoFlatMap.java:132)
	... 145 common frames omitted
Caused by: java.nio.BufferOverflowException: null
	at org.apache.qpid.proton.codec.WritableBuffer$ByteBufferWrapper.ensureRemaining(WritableBuffer.java:152)
	at org.apache.qpid.proton.codec.BinaryType.fastWrite(BinaryType.java:75)
	at org.apache.qpid.proton.codec.EncoderImpl.writeBinary(EncoderImpl.java:535)
	at org.apache.qpid.proton.codec.messaging.FastPathDataType.write(FastPathDataType.java:126)
	at org.apache.qpid.proton.codec.messaging.FastPathDataType.write(FastPathDataType.java:36)
	at org.apache.qpid.proton.codec.EncoderImpl.writeObject(EncoderImpl.java:734)
	at org.apache.qpid.proton.message.impl.MessageImpl.encode(MessageImpl.java:740)
	at org.apache.qpid.proton.message.impl.MessageImpl.encode(MessageImpl.java:696)
	at com.azure.messaging.servicebus.implementation.ManagementChannel.lambda$schedule$13(ManagementChannel.java:349)
	... 146 common frames omitted
Caused by: java.lang.IndexOutOfBoundsException: Requested min remaining bytes(155) exceeds remaining(137) in underlying ByteBuffer: java.nio.HeapByteBuffer[pos=636 lim=773 cap=773]
	at org.apache.qpid.proton.codec.WritableBuffer$ByteBufferWrapper.ensureRemaining(WritableBuffer.java:148)
	... 154 common frames omitted

To Reproduce Steps to reproduce the behavior:

This happens seemingly at random, and only a very small percentage of the times we invoke com.azure.messaging.servicebus.ServiceBusSenderClient.scheduleMessage. I don't have a reproducer that can pinpoint the problem at this point, but the full method where the problem happens is attached below. I believe running the code below either by only using the exception path, or by replacing messageConsumer.consume(message) with something like throw new RuntimeException("Oh no") and running it enough times will show the problem.

As mentioned, the message is (usually) scheduled after a few retries.

Code Snippet Add the code snippet that causes the issue.

Our processMessage method (passed on to ServiceBusProcessorClientBuilder.processMessage()) looks like this:

    public void processMessage(ServiceBusReceivedMessageContext serviceBusReceivedMessageContext) {
        ServiceBusReceivedMessage message = serviceBusReceivedMessageContext.getMessage();

        try {
            messageConsumer.consume(message); // Client message processing. Fails with an exception in this case
            serviceBusReceivedMessageContext.complete();
        } catch (Exception exception) {
            int deliveryCount = Optional.ofNullable((Integer) message.getApplicationProperties().get(X_NRK_DELIVERY_COUNT))
                                        .orElse(1);

            if (deliveryCount < 20) {
                int delay = Math.min(1 << (deliveryCount - 1), 120); // Inlined calculation of exponential delay 

                ServiceBusMessage retryMessage = new ServiceBusMessage(message);
                retryMessage.getApplicationProperties().put(X_NRK_DELIVERY_COUNT, deliveryCount + 1);
                serviceBusSenderClient.scheduleMessage(retryMessage, OffsetDateTime.now().plusSeconds(delay)); // Where it fails

                serviceBusReceivedMessageContext.complete();
            } else {
                DeadLetterOptions deadLetterOptions = new DeadLetterOptions()
                                                              .setPropertiesToModify(Map.of(X_NRK_DELIVERY_COUNT, 0)) // reset delivery count to prepare for possibility to put message back to queue later
                                                              .setDeadLetterReason(exception.getClass().getName())
                                                              .setDeadLetterErrorDescription(exception.getMessage());

                serviceBusReceivedMessageContext.deadLetter(deadLetterOptions);
            }
        }
    }

    static final String X_NRK_DELIVERY_COUNT = "x-nrk-delivery-count";

Expected behavior

The message should be scheduled without exception.

Screenshots If applicable, add screenshots to help explain your problem.

Setup (please complete the following information):

  • OS: n/a, runs in Kubernetes, on AKS
  • IDE: n/a, runs in Kubernetes, on AKS
  • Library/Libraries:
    • com.azure:azure-core:1.51.0
    • com.azure:azure-core-ampq:2.9.7
    • com.azure:azure-core-http-netty:1.15.2
    • com.azure:azure-json:1.2.0
    • com.azure:azure-messaging-servicebus:7.17.2
    • com.azure:azure-xml:1.1.0
    • org.apache.qpid:proton-j:0.34.1
  • Java version: 21
  • App Server/Environment: Tomcat
  • Frameworks: Spring Boot

Additional context Add any other context about the problem here.

Information Checklist Kindly make sure that you have added all the following information above and checkoff the required fields otherwise we will treat the issuer as an incomplete report

  • [x] Bug Description Added
  • [x] Repro Steps Added
  • [x] Setup information Added

haraldk avatar Aug 22 '24 15:08 haraldk

@anuchandy @conniey @lmolkova

github-actions[bot] avatar Aug 22 '24 15:08 github-actions[bot]

Thank you for your feedback. Tagging and routing to the team member best able to assist.

github-actions[bot] avatar Aug 22 '24 15:08 github-actions[bot]

Hi @haraldk, could you help with a reference message that ran into this error?

Basically, you would update code to print one of the impacted messages, which I can use to reproduce this. The code change look likes

try {
    serviceBusSenderClient.scheduleMessage(retryMessage, OffsetDateTime.now().plusSeconds(delay));
} catch (ServiceBusException sbe) {
    if (sbe.getCause() != null && sbe.getCause() instanceof AmqpException) {
        final AmqpException amqpException = (AmqpException) sbe.getCause();
        if (amqpException.getErrorCondition() == AmqpErrorCondition.LINK_PAYLOAD_SIZE_EXCEEDED) {
            final Throwable innerCause = amqpException.getCause();
            if (innerCause != null && innerCause instanceof BufferOverflowException) {
                System.out.println(innerCause.getMessage());
                MessagePrinter.print(retryMessage);
            }
        }
    }
    throw sbe;
}

The "MessagePrinter" utility class used above can be found here https://gist.github.com/anuchandy/5b08d59e2ac0528dc58b99541c8765f1

anuchandy avatar Aug 27 '24 21:08 anuchandy

Thanks @anuchandy! I've modified your code slightly to use our logging, but it should otherwise output the same details. Will update the issue with the logged information as soon as possible.

haraldk avatar Aug 28 '24 13:08 haraldk

Here's the output of the extra output from a few occurrences extracted from our logs:

Exception message: null

Message {  header=Header{durable=null, priority=4, ttl=1209600000, firstAcquirer=null, deliveryCount=null} properties=Properties{messageId=c344a9cf16bc4926b5bc3426c88b7810, userId=, to='null', subject='https://lydbanken.kubeint.nrk.no/api/archive/record/cab3ff5d-d576-4269-b3ff-5dd576c269de', replyTo='null', correlationId=null, contentType=application/vnd.nrk.lydbanken.archive.v1+json, contentEncoding=null, absoluteExpiryTime=Thu Sep 12 10:45:10 GMT 2024, creationTime=Thu Aug 29 10:45:10 GMT 2024, groupId='null', groupSequence=0, replyToGroupId='null'} application_properties=ApplicationProperties{{x-nrk-delivery-count=2}} message_annotations=MessageAnnotations{{x-opt-scheduled-enqueue-time=Thu Aug 29 10:45:11 GMT 2024}} delivery_annotations=DeliveryAnnotations{{x-opt-transfer-source=SBN-LYDTEAMET-LYDBANKEN-PROD-AZENO-1:TOPIC:LYDBANKEN-ARCHIVE-EVENTS|LYDBANKEN-INTEGRASJONER-VIRTUOSO-UPDATER#24ab2f09-c7bd-4a17-9c22-366989a8048a[5], x-opt-transfer-sn=21384, x-opt-transfer-session=null, x-opt-transfer-hop-count=1, x-opt-transfer-resource=27911146, x-opt-lock-token=15eea98e-c2dc-4b08-8839-05bf80be5555}} body=Data{{"uri":"https://lydbanken.kubeint.nrk.no/api/archive/record/cab3ff5d-d576-4269-b3ff-5dd576c269de","version":3,"changed":"2024-08-29T10:45:09.758605392Z"}} footer=Footer{{}} }

MessageType {  application_properties={x-nrk-delivery-count=java.lang.Integer, } message_annotations={x-opt-scheduled-enqueue-time=java.util.Date, } delivery_annotations={x-opt-transfer-source=java.lang.String, x-opt-transfer-sn=java.lang.Long, x-opt-transfer-session=NAN, x-opt-transfer-hop-count=java.lang.Integer, x-opt-transfer-resource=java.lang.Long, x-opt-lock-token=java.util.UUID, } }


Exception message: null

Message {  header=Header{durable=null, priority=4, ttl=1209600000, firstAcquirer=null, deliveryCount=null} properties=Properties{messageId=0d6c5bb09140495882da6abfa2e9cb5e, userId=, to='null', subject='https://lydbanken.kubeint.nrk.no/api/archive/record/61054037-fff3-449e-8540-37fff3b49e77', replyTo='null', correlationId=null, contentType=application/vnd.nrk.lydbanken.archive.v1+json, contentEncoding=null, absoluteExpiryTime=Thu Sep 12 10:15:22 GMT 2024, creationTime=Thu Aug 29 10:15:22 GMT 2024, groupId='null', groupSequence=0, replyToGroupId='null'} application_properties=ApplicationProperties{{x-nrk-delivery-count=2}} message_annotations=MessageAnnotations{{x-opt-scheduled-enqueue-time=Thu Aug 29 10:15:29 GMT 2024}} delivery_annotations=DeliveryAnnotations{{x-opt-transfer-source=SBN-LYDTEAMET-LYDBANKEN-PROD-AZENO-1:TOPIC:LYDBANKEN-ARCHIVE-EVENTS|LYDBANKEN-INTEGRASJONER-VIRTUOSO-UPDATER#24ab2f09-c7bd-4a17-9c22-366989a8048a[5], x-opt-transfer-sn=21349, x-opt-transfer-session=null, x-opt-transfer-hop-count=1, x-opt-transfer-resource=27911146, x-opt-lock-token=88f9eabc-7452-4168-a0e1-74b5ba3b5b8d}} body=Data{{"uri":"https://lydbanken.kubeint.nrk.no/api/archive/record/61054037-fff3-449e-8540-37fff3b49e77","version":1,"changed":"2024-08-29T10:15:22.092180924Z"}} footer=Footer{{}} }

MessageType {  application_properties={x-nrk-delivery-count=java.lang.Integer, } message_annotations={x-opt-scheduled-enqueue-time=java.util.Date, } delivery_annotations={x-opt-transfer-source=java.lang.String, x-opt-transfer-sn=java.lang.Long, x-opt-transfer-session=NAN, x-opt-transfer-hop-count=java.lang.Integer, x-opt-transfer-resource=java.lang.Long, x-opt-lock-token=java.util.UUID, } }


Exception message: null

Message {  header=Header{durable=null, priority=4, ttl=1209600000, firstAcquirer=null, deliveryCount=null} properties=Properties{messageId=0d6c5bb09140495882da6abfa2e9cb5e, userId=, to='null', subject='https://lydbanken.kubeint.nrk.no/api/archive/record/61054037-fff3-449e-8540-37fff3b49e77', replyTo='null', correlationId=null, contentType=application/vnd.nrk.lydbanken.archive.v1+json, contentEncoding=null, absoluteExpiryTime=Thu Sep 12 10:15:22 GMT 2024, creationTime=Thu Aug 29 10:15:22 GMT 2024, groupId='null', groupSequence=0, replyToGroupId='null'} application_properties=ApplicationProperties{{x-nrk-delivery-count=2}} message_annotations=MessageAnnotations{{x-opt-scheduled-enqueue-time=Thu Aug 29 10:15:29 GMT 2024}} delivery_annotations=DeliveryAnnotations{{x-opt-transfer-source=SBN-LYDTEAMET-LYDBANKEN-PROD-AZENO-1:TOPIC:LYDBANKEN-ARCHIVE-EVENTS|LYDBANKEN-INTEGRASJONER-VIRTUOSO-UPDATER#24ab2f09-c7bd-4a17-9c22-366989a8048a[5], x-opt-transfer-sn=21349, x-opt-transfer-session=null, x-opt-transfer-hop-count=1, x-opt-transfer-resource=27911146, x-opt-lock-token=88f9eabc-7452-4168-a0e1-74b5ba3b5b8d}} body=Data{{"uri":"https://lydbanken.kubeint.nrk.no/api/archive/record/61054037-fff3-449e-8540-37fff3b49e77","version":1,"changed":"2024-08-29T10:15:22.092180924Z"}} footer=Footer{{}} }

MessageType {  application_properties={x-nrk-delivery-count=java.lang.Integer, } message_annotations={x-opt-scheduled-enqueue-time=java.util.Date, } delivery_annotations={x-opt-transfer-source=java.lang.String, x-opt-transfer-sn=java.lang.Long, x-opt-transfer-session=NAN, x-opt-transfer-hop-count=java.lang.Integer, x-opt-transfer-resource=java.lang.Long, x-opt-lock-token=java.util.UUID, } }


Exception message: null

Message {  header=Header{durable=null, priority=4, ttl=1209600000, firstAcquirer=null, deliveryCount=null} properties=Properties{messageId=d6059e2e70304d3a922f71b2d944f462, userId=, to='null', subject='https://lydbanken.kubeint.nrk.no/api/archive/record/658825c8-b0d5-484e-8825-c8b0d5f84ea6', replyTo='null', correlationId=null, contentType=application/vnd.nrk.lydbanken.archive.v1+json, contentEncoding=null, absoluteExpiryTime=Thu Sep 12 10:15:15 GMT 2024, creationTime=Thu Aug 29 10:15:15 GMT 2024, groupId='null', groupSequence=0, replyToGroupId='null'} application_properties=ApplicationProperties{{x-nrk-delivery-count=2}} message_annotations=MessageAnnotations{{x-opt-scheduled-enqueue-time=Thu Aug 29 10:15:17 GMT 2024}} delivery_annotations=DeliveryAnnotations{{x-opt-transfer-source=SBN-LYDTEAMET-LYDBANKEN-PROD-AZENO-1:TOPIC:LYDBANKEN-ARCHIVE-EVENTS|LYDBANKEN-INTEGRASJONER-VIRTUOSO-UPDATER#24ab2f09-c7bd-4a17-9c22-366989a8048a[5], x-opt-transfer-sn=21324, x-opt-transfer-session=null, x-opt-transfer-hop-count=1, x-opt-transfer-resource=27911146, x-opt-lock-token=b4fd5abc-184a-44a9-9bb4-eac3797588b3}} body=Data{{"uri":"https://lydbanken.kubeint.nrk.no/api/archive/record/658825c8-b0d5-484e-8825-c8b0d5f84ea6","version":2,"changed":"2024-08-29T10:15:15.570698376Z"}} footer=Footer{{}} }

MessageType {  application_properties={x-nrk-delivery-count=java.lang.Integer, } message_annotations={x-opt-scheduled-enqueue-time=java.util.Date, } delivery_annotations={x-opt-transfer-source=java.lang.String, x-opt-transfer-sn=java.lang.Long, x-opt-transfer-session=NAN, x-opt-transfer-hop-count=java.lang.Integer, x-opt-transfer-resource=java.lang.Long, x-opt-lock-token=java.util.UUID, } }

The exception message wasn't very helpful in this case, but I hope the extra details can help in finding the issue! 😀

haraldk avatar Aug 30 '24 14:08 haraldk

Hi @haraldk, thanks! it looks like the value of the field "Message.Properties.userId" is redacted, possibly a filter in the logging configuration? Could you double check and if so, can we output few new impacted messages again including this field value?

anuchandy avatar Sep 01 '24 17:09 anuchandy

Hi @@haraldk, I can repro using the shared messages, no need to pull another set of messages (with userId)

anuchandy avatar Sep 01 '24 19:09 anuchandy

hello @haraldk, we have a fix for this https://github.com/Azure/azure-sdk-for-java/pull/41706, thanks for helping with the details and test messages!

anuchandy avatar Sep 03 '24 15:09 anuchandy

Excellent news! I’ll have look, and will test as soon as possible. 😀

haraldk avatar Sep 03 '24 16:09 haraldk

The version (7.17.14) with the fix should be available in September release. Closing, now that the changes are merged to the main.

anuchandy avatar Sep 03 '24 21:09 anuchandy