
Occasional re-registering of synced-probe-list nodes

Open alexey-yarmosh opened this issue 1 year ago • 4 comments

In New Relic there are many repeated logs like "Removed node *." and "Registered new node *.". That indicates that either the API server or Redis was too slow to update at the expected frequency.

Most likely that is the reason why some probes are offline on the dashboard from time to time: a worker can't sync with an external node -> removes its nodes from fetchProbes -> AdoptedProbes syncs the missing probes as offline.
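To illustrate the suspected pattern, here is a minimal sketch of a heartbeat-based node registry. It is not the actual Globalping code: `NodeRegistry`, `heartbeat`, `sweep`, and the timeout value are hypothetical names and numbers chosen for the example. It only shows how a single late heartbeat can produce a "Removed node" log followed by a "Registered new node" log for the same node.

```typescript
// Hypothetical sketch, NOT the actual Globalping implementation.
type LogFn = (message: string) => void;

class NodeRegistry {
  private lastSeen = new Map<string, number>();
  private timeoutMs: number;
  private log: LogFn;

  constructor(timeoutMs: number, log: LogFn) {
    this.timeoutMs = timeoutMs;
    this.log = log;
  }

  // Called whenever a heartbeat from a node is observed.
  heartbeat(nodeId: string, now: number): void {
    if (!this.lastSeen.has(nodeId)) {
      this.log(`Registered new node ${nodeId}.`);
    }
    this.lastSeen.set(nodeId, now);
  }

  // Called periodically; drops nodes whose last heartbeat is older than the timeout.
  sweep(now: number): void {
    this.lastSeen.forEach((seenAt, nodeId) => {
      if (now - seenAt > this.timeoutMs) {
        this.lastSeen.delete(nodeId);
        this.log(`Removed node ${nodeId}.`);
      }
    });
  }

  has(nodeId: string): boolean {
    return this.lastSeen.has(nodeId);
  }
}

// Simulated timeline (timestamps in ms, assumed 1000 ms timeout).
const logs: string[] = [];
const registry = new NodeRegistry(1000, (message) => logs.push(message));

registry.heartbeat('node-a', 0);     // first heartbeat: node is registered
registry.sweep(900);                 // still within the timeout: nothing happens
registry.sweep(2500);                // heartbeat is late: node is removed
registry.heartbeat('node-a', 2600);  // late heartbeat arrives: node is registered again
```

In this model, any probes attached to `node-a` would be missing between the removal and the re-registration, which would match the brief "offline" state seen on the dashboard.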

alexey-yarmosh, Sep 23 '24 11:09

99% of the occurrences were caused by a wrong Redis configuration (memory overload), and the errors have dropped close to zero since we fixed that.

The remaining occasional cases are caused on the Node.js side by event loop blocking, which is hard to track down. It got better with #560 and #561, and there might be some other places in request handling causing the rest of the spikes, but since it hardly causes any issues at this point, I'm assigning low priority.
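One possible way to narrow down such spikes is to sample event loop delay with the `monitorEventLoopDelay` histogram from Node's built-in `perf_hooks` module. The sketch below is an assumption about how that could be wired up, not what #560 or #561 do: `startEventLoopLagMonitor`, the threshold, and the reporting interval are hypothetical.

```typescript
import { monitorEventLoopDelay } from 'node:perf_hooks';

const NS_PER_MS = 1e6;

// The histogram reports delays in nanoseconds; convert for readability.
function nsToMs(ns: number): number {
  return ns / NS_PER_MS;
}

// Hypothetical helper: reports whenever the worst delay in the last window
// exceeds thresholdMs. Returns a function that stops the monitor.
function startEventLoopLagMonitor(
  thresholdMs: number,
  intervalMs: number,
  onLag: (maxMs: number, p99Ms: number) => void,
): () => void {
  const histogram = monitorEventLoopDelay({ resolution: 10 });
  histogram.enable();

  const timer = setInterval(() => {
    const maxMs = nsToMs(histogram.max);

    if (maxMs > thresholdMs) {
      onLag(maxMs, nsToMs(histogram.percentile(99)));
    }

    // Start a fresh window so each report covers only the last interval.
    histogram.reset();
  }, intervalMs);

  return () => {
    clearInterval(timer);
    histogram.disable();
  };
}

// Example usage (assumed values: 200 ms threshold, 5 s window):
// const stop = startEventLoopLagMonitor(200, 5000, (maxMs, p99Ms) => {
//   console.warn(`Event loop blocked: max=${maxMs.toFixed(1)}ms p99=${p99Ms.toFixed(1)}ms`);
// });
```

Correlating these reports with the timestamps of the "Removed node" logs could show whether a given spike lines up with a blocked event loop.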

MartinKolarik, Nov 21 '24 16:11

Is this still happening after the latest changes and Redis upgrades?

jimaek, Mar 05 '25 15:03

Ping. Can we close this?

jimaek, Sep 14 '25 13:09

> The remaining occasional cases are caused on the node.js side by event loop blocking, which is hard to track down. It got better with https://github.com/jsdelivr/globalping/pull/560 and https://github.com/jsdelivr/globalping/pull/561, and there might be some other places in request handling causing the rest of the spikes, but since it hardly causes any issues at this point, I'm assigning low priority.

This part is still relevant.

MartinKolarik, Sep 14 '25 13:09