Issues with setting up metricsServer proxy

Open PavelGloba opened this issue 9 months ago • 4 comments

What happened?

I'm not sure whether this is working as intended, but the configuration and documentation don't match the setup that is actually required. My cluster has the metrics server installed in the "metrics-server" namespace instead of the "kube-system" namespace. I installed a fresh vcluster 0.24.1 and couldn't get the metrics server proxy working for a while. I found this issue, and it turns out I had to add port 10250 to the egress policy to get things going. On the same host cluster I have a number of 0.19.7 vclusters that work fine without opening this port and without any additional configuration. So I suspect that newer versions somehow talk directly to the metrics server's container port. Also, while investigating, I found that this old issue is back if I specify apiService while port 10250 is closed by the network policy:

integrations:
  metricsServer:
    enabled: true
    apiService:
      service:
        name: metrics-server
        namespace: metrics-server
        port: 443

At the same time, I'm able to delete namespaces if I don't specify apiService or if I disable metricsServer.

What did you expect to happen?

I expect the metrics server proxy to work with the metrics server service specified in the apiService configuration. Otherwise, the documentation should mention that container port 10250 must be allowed in the egress network policy.

How can we reproduce it (as minimally and precisely as possible)?

not needed

Anything else we need to know?

No response

Host cluster Kubernetes version

1.30

vcluster version

0.24.1

VCluster Config

# My vcluster.yaml / values.yaml here

PavelGloba avatar Apr 11 '25 00:04 PavelGloba

Hey @PavelGloba! Thanks for creating this issue. We changed how we connect to the metrics server: instead of going through the host API server, we now connect to the metrics server directly. We switched because the round trip through the host API server caused issues with OpenAPI / schema resolution and made things a lot more complex.

Regarding the network policy, it's true that you now need to add an exception for this, since previously the connection was routed only through the host API server instead of reaching the metrics server directly. It's interesting that you need to add the target port 10250 even though we only connect to the service directly (https://github.com/loft-sh/vcluster/blob/7b40fb76f696f74d588a311a31f5b87e8f179400/pkg/apiservice/generic.go#L157). We'll add that to the docs, but unfortunately there isn't really anything we can do about it without switching back to the old implementation, which caused other issues.
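One possible explanation for the port 10250 requirement, offered as an assumption rather than a confirmed diagnosis: many CNI plugins enforce NetworkPolicy after the Service address has been translated to a pod IP and target port, so an egress rule has to allow the metrics-server container port rather than the Service port 443. A minimal sketch of such an exception on the host cluster could look like the following. The policy name, the `my-vcluster` namespace, and the `app: vcluster` pod label are hypothetical and need to be adjusted to the actual deployment; the `kubernetes.io/metadata.name` label is set on namespaces automatically by Kubernetes.

```yaml
# Sketch only: the name, namespace, and pod labels are assumptions,
# not values taken from the vcluster chart or docs.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-vcluster-to-metrics-server   # hypothetical name
  namespace: my-vcluster                   # hypothetical: host namespace running the vcluster control plane
spec:
  podSelector:
    matchLabels:
      app: vcluster                        # assumed label on the vcluster control plane pod
  policyTypes:
    - Egress
  egress:
    - to:
        - namespaceSelector:
            matchLabels:
              # Label added automatically to every namespace by Kubernetes
              kubernetes.io/metadata.name: metrics-server
      ports:
        - protocol: TCP
          port: 10250                      # metrics-server container port reported in this issue
```

NetworkPolicies are additive, so a rule like this extends whatever egress rules already apply to the selected pods rather than replacing them.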

FabianKramm avatar Apr 16 '25 07:04 FabianKramm

We also added fields to easily configure this via #2690. We could even think about adding this automatically if the metricsServer proxy is configured.

FabianKramm avatar Apr 16 '25 07:04 FabianKramm

Automatically would be good to fix https://github.com/loft-sh/vcluster/issues/2092 😅

reneleonhardt avatar Apr 16 '25 09:04 reneleonhardt

Hi, I'm on version 0.28 and I deployed the vcluster like this:

integrations:
  metricsServer:
    apiService:
      service:
        name: metrics-server
        namespace: metrics-server
        port: 443

Nevertheless, the vcluster logs show:

syncer 2025-09-24 13:08:21 ERROR handler/handler.go:49 Error while proxying request: dial tcp: lookup metrics-server.kube-system on 172.20.0.10:53: no such host {"component": "vcluster"}

Please note the lookup of metrics-server.kube-system on 172.20.0.10:53. It looks like the configured namespace is not considered at all.

irizzant avatar Sep 24 '25 13:09 irizzant