Fabian Groffen

Results: 267 comments by Fabian Groffen

This is unrelated to the request, but do you think your workers can handle the load when one worker becomes unavailable? From a fail-over point of view, this feels very...

I think real HA means you'd have to do it on two nodes at the same time (double), because aggregations depend on state, which gets lost if the engine stops.

Indeed, my criticism aside, the problem is an implementation detail. In the past I used some technique to share queues of servers, perhaps I can use that to implement this...

I haven't had any bright ideas yet on how to test the software, except for ripping it apart and trying unit tests, but that would mean a major investment which I can't...

I've made a start with this in master. It isn't as extensive as one would hope, but it's something.

I have a random idea here, no idea if it works:
```
cluster foo
    fnv1a_ch
        ip-of-target-server
        127.0.0.127:12345
        127.0.0.128:12345
        127.0.0.129:12345
        127.0.0.130:12345
    ;
```
Now ip-of-target-server should receive ~20% of traffic, while...

You could even use the aliasing system to work up your percentages more easily, like:
```
ip=a
ip=b
127.0.0.1:12345=c
127.0.0.1:12345=d
127.0.0.1:12345=e
```
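The weighting idea from these two comments can be tried out on a toy consistent-hash ring. This is only a sketch and not how the relay's `fnv1a_ch` places members: it uses MD5 from Python's standard library for ring positions, and `build_ring`, `route`, `traffic_share`, the replica count, and the metric names are all made up for illustration. Each member is an (address, alias) pair; the alias decides the ring positions, so an address listed under more aliases should end up with a larger share.

```python
import bisect
import hashlib
from collections import Counter


def ring_pos(key: str) -> int:
    """Map a string to a position on the ring (MD5 here, an assumption)."""
    return int.from_bytes(hashlib.md5(key.encode()).digest()[:8], "big")


def build_ring(members, replicas=200):
    """members: list of (address, alias); every alias gets `replicas` points."""
    ring = []
    for address, alias in members:
        for i in range(replicas):
            ring.append((ring_pos(f"{alias}#{i}"), address))
    ring.sort()
    return ring


def route(ring, positions, metric):
    """Pick the first ring point clockwise from the metric's position."""
    idx = bisect.bisect_right(positions, ring_pos(metric)) % len(ring)
    return ring[idx][1]


def traffic_share(members, metrics):
    """Fraction of the given metrics that lands on each address."""
    ring = build_ring(members)
    positions = [pos for pos, _ in ring]
    counts = Counter(route(ring, positions, m) for m in metrics)
    total = sum(counts.values())
    return {address: n / total for address, n in counts.items()}


# Hypothetical metric names, just to have something to hash.
metrics = [f"servers.host{i}.cpu.user" for i in range(20000)]

# One of five members is the target: expect roughly a fifth of the traffic.
one_in_five = [("target", "a"), ("local", "b"), ("local", "c"),
               ("local", "d"), ("local", "e")]
# Two of five aliases point at the target: expect roughly two fifths.
two_in_five = [("target", "a"), ("target", "b"), ("local", "c"),
               ("local", "d"), ("local", "e")]

share_one = traffic_share(one_in_five, metrics)["target"]
share_two = traffic_share(two_in_five, metrics)["target"]
print(f"1 of 5: {share_one:.1%}, 2 of 5: {share_two:.1%}")
```

The shares are only approximate, since they depend on where the hash happens to place each point, so percentages built this way are rough rather than exact.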

My first shot would be that this is a divide by zero: for some reason the bucket (60 seconds) didn't receive any values. If the problem happens often enough, it...
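To illustrate the suspicion: in C on IEEE 754 platforms, dividing a floating-point zero sum by a zero count produces NaN instead of an error, which would match a NaN showing up for an empty bucket. A minimal sketch of guarding against that case follows; `bucket_average` is a hypothetical helper, not a function from the relay, and Python raises `ZeroDivisionError` where C would quietly return NaN.

```python
def bucket_average(values):
    """Average of one time bucket, or None when the bucket received nothing."""
    if not values:
        # Empty bucket: there is nothing to average, so report "no value"
        # instead of dividing by a count of zero.
        return None
    return sum(values) / len(values)


print(bucket_average([1.0, 3.0]))
print(bucket_average([]))
```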

OK, then I'd like to know if collectd is able to deliver the metrics to the relay; a tcpdump around a NaN value should help here. Also, I just realised...

Right, it looks like there's a "garbage value" which ends up being NaN, and math on NaNs ends up being NaN. Question right now is (given we can't fix why...
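The propagation is easy to reproduce: a single NaN in the input poisons the whole sum. The sketch below shows that, plus one possible way of dropping non-finite values before aggregating. The `clean` helper is hypothetical and only illustrates the option; it does not describe what the relay actually does.

```python
import math


def clean(values):
    """Keep only finite values, dropping NaN and +/-inf."""
    return [v for v in values if math.isfinite(v)]


samples = [1.0, float("nan"), 3.0]

poisoned = sum(samples)         # any arithmetic involving NaN yields NaN
filtered = sum(clean(samples))  # the NaN is dropped before summing

print(math.isnan(poisoned), filtered)
```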