Kamon icon indicating copy to clipboard operation
Kamon copied to clipboard

UnsupportedOperationException when reading trace context from kafla

Open dr3s opened this issue 2 years ago • 6 comments

UnsupportedOperationException when reading trace context from kafla

running kafka 2.8 with compression

Only occurs when tracing is set to continue

    kamon.instrumentation.kafka {
      client.tracing {
        continue-trace-on-consumer = yes
      }
    }

Error reading the context from the header.

I think the error is on this line when the readAll can't be invoked on the reader but it could also be any other UnsupportedOperation on the context here: https://github.com/kamon-io/Kamon/blob/master/core/kamon-core/src/main/scala/kamon/context/BinaryPropagation.scala#L170

void java.lang.Throwable.<init>(String) 142486 100 %
void java.lang.Exception.<init>(String) 142486 100 %
void java.lang.RuntimeException.<init>(String) 142486 100 %
void java.lang.UnsupportedOperationException.<init>(String) 142486 100 %
Try scala.util.Success.failed() 142486 100 %
Context kamon.context.BinaryPropagation$Default.read(BinaryPropagation$ByteStreamReader) 142485 100 %
Context kamon.context.BinaryPropagation$Default.read(Object) 142485 100 %
Context kamon.instrumentation.kafka.client.ContextSerializationHelper$.fromByteArray(byte[]) 142485 100 %
Context kamon.instrumentation.kafka.client.RecordProcessor$.$anonfun$process$2(Header) 142485 100 %
Object kamon.instrumentation.kafka.client.RecordProcessor$$$Lambda$1170.322308893.apply(Object) 142485 100 %
Option scala.Option.map(Function1) 142485 100 %
void kamon.instrumentation.kafka.client.RecordProcessor$.$anonfun$process$1(Instant, ConsumedRecordData$ConsumerInfo, ConsumerRecord) 142485 100 %
Object kamon.instrumentation.kafka.client.RecordProcessor$.$anonfun$process$1$adapted(Instant, ConsumedRecordData$ConsumerInfo, ConsumerRecord) 142485 100 %
Object kamon.instrumentation.kafka.client.RecordProcessor$$$Lambda$1169.1580561024.apply(Object) 142485 100 %

dr3s avatar Sep 09 '22 23:09 dr3s

Hey @dr3s,

I've been trying to reproduce this issue but no luck so far. I tried with corrupted headers, no headers, different settings, different Kafka versions and all of them worked out.

What additional info could you share to reproduce this issue? and, do you have access to the exception logged here from your logs? This that you have above seems like some sort of aggregate that might be missing some important info :thinking:

ivantopo avatar Sep 12 '22 11:09 ivantopo

This trace was from our profiler. For some reason I don't see the exceptions in the logs and we have the level set to INFO. To give you an idea of volume, I recorded about 1M exceptions per minute. I'll try to just debug locally and capture a trace.

In general though, I think it may be problematic in an APM library to use exceptions in this way. It seems the message we are seeing is from Success.failed https://github.com/scala/scala/blob/2.13.x/src/library/scala/util/Try.scala#L277 It would be better to log and return a failure type that doesn't wrap an exception as this code will init Throwable and appears to impact performance.

image

dr3s avatar Sep 12 '22 12:09 dr3s

Oh, I see it now!

The exception you are seeing is created by the Success.failed implementation, but it is never thrown! It means that every time a binary context is deserialized with the binary propagation we are creating an exception (with all the garbage that comes with it) and doing nothing with it :exploding_head:

I'll fix this ASAP.

ivantopo avatar Sep 12 '22 12:09 ivantopo

Thanks for sharing that screenshot @dr3s, everything clicked with it!

#1206 fixes this issue and should be out by the end of this week. Stay tuned for the next release!

ivantopo avatar Sep 12 '22 12:09 ivantopo

Right, I actually do this in my own code and it's probably bad practice. I don't know what the overhead is but Success.failed() probably shouldn't be doing it either. I wonder if it has changed in Scala 3.

I still don't know where that function is actually being invoked. Is it just because the Tryobject is being applied that the failed function constructs the exception?

dr3s avatar Sep 12 '22 13:09 dr3s

It gets called inside the binary propagation here: https://github.com/kamon-io/Kamon/blob/master/core/kamon-core/src/main/scala/kamon/context/BinaryPropagation.scala#L174-L176

The Try docs state this: image

So what is happening is that on the call site above we always have a success, which means .failed creates a wrapper with an UnsupportedOperationException that never calls that .foreach because there is no actual error.

ivantopo avatar Sep 12 '22 13:09 ivantopo