
Added multi-processor support for cube servers

Open godsflaw opened this issue 12 years ago • 4 comments

If options.workers is absent, this code should fork a worker for every CPU on the host. Obviously options.workers may be used to override this default. Please update wiki documentation accordingly.
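For readers unfamiliar with the pattern, the default described above might look roughly like the sketch below. It is not the code from this pull request: `workerCount`, `startCluster`, and `startServer` are hypothetical names chosen for illustration, and only `cluster.isMaster`, `cluster.fork()`, and `os.cpus()` are real Node APIs (newer Node versions also expose `cluster.isPrimary`).

```javascript
// Sketch only: workerCount and startCluster are hypothetical helpers,
// not functions taken from Cube or from this pull request.
const cluster = require('cluster');
const os = require('os');

// Use options.workers when it is a positive integer; otherwise fall back
// to one worker per CPU reported by the host (at least one).
function workerCount(options) {
  const requested = options && options.workers;
  if (Number.isInteger(requested) && requested > 0) return requested;
  return Math.max(1, os.cpus().length);
}

// In the master process, fork the workers; in each worker, run the server.
// startServer stands in for whatever starts a collector or evaluator.
function startCluster(options, startServer) {
  if (cluster.isMaster) {
    const count = workerCount(options);
    for (let i = 0; i < count; i++) cluster.fork();
  } else {
    startServer(options);
  }
}
```

Under these assumptions, a call such as `startCluster({ workers: 2 }, startServer)` would fork two workers, while `startCluster({}, startServer)` would fork one per CPU.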

godsflaw avatar Jan 03 '13 22:01 godsflaw

I like the idea of this but I don't know a lot about the node cluster module. Does it support UDP and WebSockets and everything else?

I might split this out into a separate bin file to keep the basic collector/emitter very simple and clear.

I worry a bit about some of the timing logic in event.js, around setInterval (not in your code, which is quite clear). I'm still getting familiar with the small details of Cube, so I'm not sure whether there are weird race conditions, or any duplicated work, when more than one collector or evaluator handles the same event type.

Also - have you run into CPU issues with only a single instance?

RandomEtc avatar Mar 05 '13 02:03 RandomEtc

I think it is wise to dig in and see if there are any race conditions. We did run into problems with inserts, and especially queries, against a single server, which may have caused me to make my change a little hastily. On quick inspection, it looked like everything was contained well within a single server instance. That is, it looks like one could get parallelism simply by running more collectors and evaluators, which made me think it was ideal for cluster.

We've been running this code for a few months and it's handling 250 largish documents a second with more than 10 indexes in one of the event collections. It produced the speedup we needed, and appears to run well.

It is worth noting that, for some of my stats where I present a percentage, very rarely I will get values back (and cache them) that are over 100%. This throws off the scaling of the cubism graphs. This bug, however, could exist in a number of places and is likely unrelated. Other than that, all my other limiting factors are related to MongoDB, and there are no other observable bugs.

godsflaw avatar Mar 05 '13 20:03 godsflaw

Thanks for the extra notes. Let's keep this pull request open for the time being - if anyone has time to look more thoroughly at the use of intervals and timeouts in Cube and how they interact with node's cluster module then please post here. If I start looking into it I'll post back with an update.

RandomEtc avatar Mar 05 '13 21:03 RandomEtc

We are also running horizontally scaled collectors and evaluators, but we did it before cluster made it to node. It seems to go well on our side too, except for one thing: if you plug the collectors into collectd, you must never send any "derive" event. These events depend on the previous value to compute the actual value before inserting into mongo, so you will of course get very unexpected values depending on the collector you reach :)
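To make the "derive" problem concrete, here is a small simulation. It is a simplified model, not Cube's actual collectd handling: it assumes each collector keeps the previous raw sample in its own process memory and emits the difference between consecutive samples, ignoring the time interval. `makeCollector`, `deriveAcross`, and the sample values are made up for illustration.

```javascript
// Hypothetical model of a collector that remembers the previous raw sample
// in process memory; this is not Cube's real implementation.
function makeCollector() {
  let previous = null;
  return function derive(raw) {
    const delta = previous === null ? null : raw - previous;
    previous = raw;
    return delta; // null until this collector has seen a first sample
  };
}

// Feed samples to a pool of collectors round-robin, as a load balancer might,
// and return the derived values that would be stored.
function deriveAcross(samples, collectorCount) {
  const collectors = Array.from({ length: collectorCount }, makeCollector);
  return samples
    .map((raw, i) => collectors[i % collectorCount](raw))
    .filter((delta) => delta !== null);
}

const samples = [100, 150, 210, 300]; // made-up, steadily increasing counter
console.log(deriveAcross(samples, 1)); // [ 50, 60, 90 ]
console.log(deriveAcross(samples, 2)); // [ 110, 150 ]
```

With one collector the stored deltas match the real increments. With two, each collector only sees every other sample, so the deltas it computes span two intervals and no longer describe the counter correctly.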

I would add that scaling the server itself is a good thing, but I was thinking that it would be even better to scale the computations themselves. With the cluster mode you improve your responsiveness with many clients, but each individual computation will still take a good amount of time.

Marsup avatar Mar 07 '13 14:03 Marsup