node-statsd
Added batching mode
Added a batch option: if true, messages are batched and flushed either when 20 messages accumulate or after 1 second. Messages of different kinds can be mixed within a batch. Callbacks for individual messages are called when the batch is sent.
Several tests included, verifying that the new mode works as expected.
What is the purpose of this? UDP doesn't do any handshaking, so there is no performance gain from batching these calls. From what I can tell, it does little more than delay the sending of the metrics. Can you elaborate on what this is designed to accomplish?
Furthermore, delaying metrics sent to the StatsD daemon, which already rolls up data by time window, may throw off the reported metrics depending on what StatsD's time window is versus the time window here.
@devdazed
- We save on networking: UDP+IP+Ethernet header lengths give us 8+20+26 = 54 bytes of overhead per message, which is a lot for such small messages.
- Benchmarks show that we save significantly on CPU: sending a bunch of small packets appears to be far more expensive than composing and sending a single combined one.
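A back-of-envelope check of the savings implied by the 54-byte figure above (the helper names here are made up for illustration):

```javascript
// Per-packet overhead as quoted: UDP (8) + IPv4 (20) + Ethernet (26) bytes
const PER_PACKET_OVERHEAD = 8 + 20 + 26; // = 54 bytes

// Bytes on the wire for n metrics of length len, one metric per packet
function unbatchedBytes(n, len) {
  return n * (len + PER_PACKET_OVERHEAD);
}

// Batched: the n metrics are joined with '\n' into a single datagram
function batchedBytes(n, len) {
  return n * len + (n - 1) /* newline separators */ + PER_PACKET_OVERHEAD;
}

// Example: a batch of 20 metrics of 30 bytes each
console.log(unbatchedBytes(20, 30)); // 1680 bytes unbatched
console.log(batchedBytes(20, 30));   // 673 bytes batched
```

So for this example, batching cuts wire traffic by more than half, and the process makes one syscall instead of twenty.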
Fair enough. It looks like the .close method clears the timeout for sending batches; however, it does not check whether there are messages waiting to be sent and send them before closing the socket. It seems this could result in messages being lost when closing the connection, or am I missing where this occurs?
@devdazed Indeed, great catch, thank you. I'll fix that within the hour and notify you. I also think it's worth making the delay and message limit configurable.
I think so, having those be configurable would be optimal.
@devdazed Done:
- Made the parameters configurable
- Made it send pending messages upon client close
- Added a test to verify that the bug's fixed
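The close-flushes-pending fix can be sketched like this. The shape of the client and the `transport` parameter are assumptions for illustration, not the exact node-statsd code:

```javascript
// Sketch of the fix: before closing the socket, cancel the flush timer
// and send whatever is still buffered, so no queued messages are lost.
function makeClient(transport) {
  const client = {
    buffer: [],
    timer: null,
    flush() {
      if (client.timer) { clearTimeout(client.timer); client.timer = null; }
      if (client.buffer.length === 0) return;
      // join pending metrics into one newline-separated datagram
      transport.send(client.buffer.join('\n'));
      client.buffer = [];
    },
    close() {
      client.flush();    // the fix: drain pending messages first...
      transport.close(); // ...then tear the socket down
    },
  };
  return client;
}
```

Before the fix, `close()` only cleared the timer, so anything still in `buffer` was silently dropped.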
this looks good to me @sivy thoughts?
@devdazed I've amended the PR a bit. I realised it's better to limit not the number of messages in a batch but the actual batch size in bytes, as that's what matters.
if your goal is to come in under the MTU, shouldn't the packet header lengths be taken into account?
@devdazed That's why I'm not calling the option networkMTU but maxBatchLength. I think it's more flexible to let the user decide on this without any magical subtraction of header lengths, because we don't know what underlying technology is used, and the total length of the headers might differ from whatever we'd set here.
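Byte-length-based batching as described can be sketched as follows. `maxBatchLength` is the real option name from this PR; the function `batchByLength` and the code around it are illustrative assumptions:

```javascript
// Sketch: a message is appended to the current batch only if the batch's
// byte length stays within maxBatchLength; otherwise the current batch is
// flushed first and the message starts a new one.
function batchByLength(messages, maxBatchLength) {
  const batches = [];
  let current = '';
  for (const msg of messages) {
    const candidate = current === '' ? msg : current + '\n' + msg;
    if (current !== '' && Buffer.byteLength(candidate) > maxBatchLength) {
      batches.push(current); // flush the full batch
      current = msg;         // start a new batch with this message
    } else {
      current = candidate;
    }
  }
  if (current !== '') batches.push(current);
  return batches;
}

// Three 5-byte counters with a 12-byte limit: the first two fit together
console.log(batchByLength(['a:1|c', 'b:2|c', 'c:3|c'], 12));
// → [ 'a:1|c\nb:2|c', 'c:3|c' ]
```

Note that `Buffer.byteLength` counts UTF-8 bytes, not characters, which is what matters for datagram size.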
Please excuse my other-project interruption, but this feels like the best place to comment. Since nothing has been getting pulled into this project for a while, I have been watching things here and pulling them into a fork, https://github.com/brightcove/hot-shots. I was just looking to pull this in, but I see it duplicates a lot of what's already in https://github.com/sivy/node-statsd/pull/60, so I'm going to have to skip it. (The one big difference is the added batch delay, but that's not easy to merge as-is on top of the other changes.)