zipkin4net
Sending multiple spans per HTTP request
The Zipkin API supports sending arrays of spans instead of single spans. In the current implementation it's not possible to take advantage of this since spans are serialized one-by-one. By buffering the spans and sending them in batches, the amount of network chatter with the API can be significantly reduced for systems with lots of incoming and outgoing requests.
Based on the current design, ZipkinTraceReporter seems the best place to make this change, since it is where completed spans are serialized and sent. I would propose to buffer the spans here and asynchronously flush an array of spans to be serialized and sent. The Thrift serializers would need to be changed/regenerated as well to support serializing arrays instead of single spans.
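To make the proposal concrete, here is a minimal sketch (in Java, with hypothetical names; the real reporter would hand the batch to a serializer and sender) of a reporter that buffers completed spans and flushes them as one array-of-spans message:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: a reporter that buffers completed spans and sends them
// in batches instead of one message per span. A background timer could also
// call flush() periodically; that part is omitted here for brevity.
class BufferingReporter {
    private final int maxBatchSize;
    private final List<String> buffer = new ArrayList<>();
    // Stands in for the serializer/sender pair; records each batch "sent".
    private final List<List<String>> sentBatches = new ArrayList<>();

    BufferingReporter(int maxBatchSize) {
        this.maxBatchSize = maxBatchSize;
    }

    // Called once per completed span; flushes when the batch is full.
    synchronized void report(String span) {
        buffer.add(span);
        if (buffer.size() >= maxBatchSize) {
            flush();
        }
    }

    // Sends the buffered spans as a single array-of-spans message.
    synchronized void flush() {
        if (buffer.isEmpty()) return;
        sentBatches.add(new ArrayList<>(buffer));
        buffer.clear();
    }

    synchronized List<List<String>> sentBatches() {
        return sentBatches;
    }
}
```

With a batch size of 2, reporting three spans and flushing produces two messages instead of three, which is the network-chatter reduction described above.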
Is this something you'd be interested in a PR for?
I think most Kafka transports already support batching messages, so the gain there would be limited. However, I must admit that it could be really useful for HTTP.
However, I'm not sure I understand what ZipkinTraceReporter refers to. Off the top of my head, I would suggest introducing a new class between ZipkinTracer and IZipkinSender.
What do you have in mind?
Good point. Using Confluent.Kafka you can set a queue buffering delay on the producer, so it waits for messages to buffer up before flushing. Batching spans ourselves could then actually be harmful: if I recall correctly, Kafka performs better with many small messages than with a few large ones.
https://github.com/openzipkin/zipkin4net/blob/master/Src/zipkin4net/Src/Tracers/Zipkin/ZipkinTracerReporter.cs This is the one I was referring to and it already sits in between ZipkinTracer and IZipkinSender :) So we could add buffering there if we want to.
To address your point about Kafka though, a solution could be to control the serialization from the sender: the sender can then decide to serialize the spans as a batch or separately, depending on what's preferable for the transport. In that case we wouldn't really need the ZipkinTracerReporter anymore, so the call chain would look something like:
ZipkinTracer -> HttpZipkinSender -> ThriftSerializer (serializing/sending spans as arrays)
ZipkinTracer -> KafkaZipkinSender -> ThriftSerializer (serializing/sending spans separately)
In this case, the HttpZipkinSender would be responsible for buffering the spans - which would be consistent with a Kafka sender since in that case buffering would be handled by the Kafka client library.
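A rough sketch of that split, in Java with hypothetical names (`Sender`, `HttpStyleSender`, `KafkaStyleSender` are all illustrative, not the library's types), where each sender chooses its own wire shape:

```java
import java.util.ArrayList;
import java.util.List;

// Each sender decides the wire shape for its transport.
interface Sender {
    void send(String span);
    List<String> messages(); // what went "over the wire", for illustration
}

// HTTP-style: buffers spans itself and emits one array payload on flush.
class HttpStyleSender implements Sender {
    private final List<String> buffer = new ArrayList<>();
    private final List<String> wire = new ArrayList<>();

    public void send(String span) { buffer.add(span); }

    public void flush() {
        if (!buffer.isEmpty()) {
            wire.add("[" + String.join(",", buffer) + "]"); // one array message
            buffer.clear();
        }
    }

    public List<String> messages() { return wire; }
}

// Kafka-style: one message per span; the client library batches underneath.
class KafkaStyleSender implements Sender {
    private final List<String> wire = new ArrayList<>();
    public void send(String span) { wire.add(span); }
    public List<String> messages() { return wire; }
}
```

The tracer only ever calls `send`; whether two spans become one HTTP body or two Kafka messages is entirely the sender's decision.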
Sorry for the wall of text - this issue is apparently a bit more complex than I first considered it to be 😄
food for thought, but single-span kafka messages are very much a v1 legacy thing. all transports should accept list messages, even if there is additional buffering at the transport layer. this has been the case for a couple of years now (even before the v2 format)
@adriancole That's good input. In that case my initial idea might work, which was to implement buffering in ZipkinTracerReporter instead. Thoughts?
There were some efforts by Yelp back in the day that ultimately showed sizing messages is more efficient, even though our storage layer wasn't also more efficient when spans arrived in batches. I definitely think bundling, ideally bounded by a message size or timeout, is a nice thing. That logic would be the same for all transports; only the message size differs. For example, Kafka has an ideal message size which is different from HTTP's, but the act of bundling is exactly the same.
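The transport-independent part of that bundling logic could be sketched like this (Java, hypothetical names; only `maxBytes` would differ per transport, and a timeout-based close is omitted for brevity):

```java
import java.util.ArrayList;
import java.util.List;

// Size-bounded bundler: the same logic for every transport,
// only the byte limit (maxBytes) differs per transport.
class SizeBundler {
    private final int maxBytes;
    private final List<byte[]> current = new ArrayList<>();
    private int currentBytes = 0;
    private final List<List<byte[]>> closed = new ArrayList<>();

    SizeBundler(int maxBytes) { this.maxBytes = maxBytes; }

    // Adds an already-encoded span; closes the bundle first if it would overflow.
    void add(byte[] encodedSpan) {
        if (currentBytes + encodedSpan.length > maxBytes && !current.isEmpty()) {
            closeBundle();
        }
        current.add(encodedSpan);
        currentBytes += encodedSpan.length;
    }

    // Closes the in-progress bundle (also what a timeout would trigger).
    void closeBundle() {
        if (!current.isEmpty()) {
            closed.add(new ArrayList<>(current));
            current.clear();
            currentBytes = 0;
        }
    }

    List<List<byte[]>> bundles() { return closed; }
}
```

A Kafka sender and an HTTP sender would share this class and only construct it with different limits.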
That's an interesting idea. It would make the serializers/senders quite a bit more complex, though: since the size isn't known until after serialization, the sender would need to serialize the spans one by one up to the optimal message size. The serializer would then need to wrap them so that they become an array, right?
Gating by size does indeed mean needing to pre-size the spans. Protocol Buffers has a built-in mechanism to pre-size a type. In the Java code, I wrote a pre-sizer, which is indeed complicated (if you don't want to allocate along the way). That said, even a "rough sizer" or a count-based limit should still be helpful. Most of the difficulty in pre-sizing (in the v2 structs) is due to UTF-8 encoding, which you can fudge by assuming ASCII or applying a factor.
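The "rough sizer" idea with the ASCII fudge could look something like this (Java; `RoughSizer` and the per-field overhead constant are hypothetical, not taken from any real codebase):

```java
// Rough sizer: estimates a span's encoded size without serializing it.
// Fudge: assume 1 byte per char (ASCII) rather than computing UTF-8 lengths,
// plus an assumed fixed per-field overhead for tags/length prefixes.
class RoughSizer {
    static final int FIELD_OVERHEAD = 4; // assumed tag + length bytes per field

    static int estimate(String... fields) {
        int size = 0;
        for (String f : fields) {
            size += f.length() + FIELD_OVERHEAD; // 1 byte/char ASCII assumption
        }
        return size;
    }
}
```

The estimate errs low for non-ASCII strings; multiplying by a safety factor (or just capping bundles by span count) keeps the bundling logic simple without an exact pre-sizer.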