hcloud-cloud-controller-manager

Add optional cache for cluster with many nodes

Open jooola opened this issue 5 months ago • 1 comment

TL;DR

In a cluster with many nodes (> 200), the status check requests each server by ID every 5 minutes. While this seems benign on smaller clusters, it exhausts the API request budget on larger clusters.

Caching the list of servers for a short period, e.g. 10s or 30s (user config?), and checking the status of a node against that cached list could already cut the number of requests by a lot.

Expected behavior

Allow caching the list of servers for a user-defined amount of time, to reduce the number of API requests the cloud controller manager makes while checking the status of a node.

jooola avatar Sep 24 '25 07:09 jooola

I think this is solved as part of the hetzner-k3s project here, maybe you can get some inspiration:

https://github.com/vitobotta/hetzner-k3s/blob/main/docs/Recommendations.md#large-cluster-architecture-since-v228

"IP Query Server: A simple container that checks the Hetzner API every 30 seconds to get the list of all node IPs"

deubert-it avatar Oct 13 '25 10:10 deubert-it