terraform-provider-kubernetes

Don't fall back to localhost cluster

Open · ikarlashov opened this issue 3 years ago · 10 comments

Hi folks,

We have GitLab CI runners executing pipelines in an EKS cluster. Whenever the Kubernetes provider can't establish a connection to the desired cluster through the provider config block, it falls back to localhost and tries to modify the cluster the pipeline itself runs in. This is very dangerous behavior and should have to be enabled EXPLICITLY in the provider settings (if there is a real use case for it at all).

Terraform Version, Provider Version and Kubernetes Version

Terraform version: 1.0.1
Kubernetes provider version: 2.6.1
Kubernetes version: 1.19

Affected Resource(s)

Authentication mechanism for provider

Debug Output

Fallback to localhost: https://gist.github.com/ikarlashov/7af79c1225e9383bd6ca135cca2e0aa3

Steps to Reproduce

Misconfigured cluster settings in the kubernetes provider block (a sketch of one such case follows).
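
For illustration, a minimal sketch (not from the original report) of a provider block whose values end up empty, which is the situation that triggers the fallback; the variable names and empty defaults here are hypothetical stand-ins for values that fail to resolve in the pipeline:

variable "cluster_endpoint" {
  type    = string
  default = "" # hypothetical: stays empty if the pipeline forgets to pass it
}

variable "cluster_ca_cert" {
  type    = string
  default = ""
}

variable "cluster_token" {
  type    = string
  default = ""
}

provider "kubernetes" {
  # With all three values empty, the provider has no usable configuration,
  # and client-go falls back to its in-cluster / localhost defaults.
  host                   = var.cluster_endpoint
  cluster_ca_certificate = var.cluster_ca_cert
  token                  = var.cluster_token
}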

Expected Behavior

Fail with an error message (as it does when run in a non-Kubernetes environment).

Actual Behavior

Tries to modify the wrong cluster (the one the runner itself is deployed in).

ikarlashov avatar Nov 02 '21 14:11 ikarlashov

Thanks for opening this @ikarlashov. This seems to be the default behaviour of client-go (we don't set any explicit configuration for the in-cluster config; it's just what happens when no options are specified and the client is inside a cluster). I need to investigate whether there is a way to disable this and make it configurable.

Is the KUBERNETES_MASTER environment variable being set in the pod you are running Terraform in? A workaround here may be to unset that variable before Terraform runs.

jrhouston avatar Nov 10 '21 06:11 jrhouston

@jrhouston no problem :)

I don't think that env variable is set. I exec'd into the gitlab-runner pod and these are the only Kubernetes-related vars:

KUBERNETES_PORT_443_TCP_PORT=443
KUBERNETES_PORT=tcp://172.20.0.1:443
KUBERNETES_SERVICE_PORT=443
FF_USE_LEGACY_KUBERNETES_EXECUTION_STRATEGY=false
KUBERNETES_SERVICE_HOST=172.20.0.1
KUBERNETES_PORT_443_TCP_PROTO=tcp
KUBERNETES_SERVICE_PORT_HTTPS=443
KUBERNETES_PORT_443_TCP_ADDR=172.20.0.1
KUBERNETES_PORT_443_TCP=tcp://172.20.0.1:443

ikarlashov avatar Nov 10 '21 10:11 ikarlashov

@ikarlashov Can you share some more information about how you are configuring the provider block in your Terraform config? After investigating, it seems like it shouldn't fall back to the in-cluster config unless the provider block ends up with empty values.

jrhouston avatar Nov 10 '21 17:11 jrhouston

Looks like client-go uses KUBERNETES_SERVICE_PORT and KUBERNETES_SERVICE_HOST to get the in-cluster config here. You could try unsetting those as a workaround for now.

jrhouston avatar Nov 10 '21 20:11 jrhouston

Facing the same issue in our own environment. The Kubernetes provider works fine with Terraform 0.13 but not with 1.0.x; it falls back to localhost. Our cluster is AWS EKS.

Configuration used:

provider "kubernetes" {
  host                   = data.aws_eks_cluster.cluster.endpoint
  cluster_ca_certificate = base64decode(data.aws_eks_cluster.cluster.certificate_authority[0].data)
  
  exec {
    api_version = "client.authentication.k8s.io/v1alpha1"
    command     = "aws-iam-authenticator"
    args = [
      "token",
      "-i",
      aws_eks_cluster.main.name,
      "--role",
      "xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx",
    ]
  }
}

Error: Get "http://localhost/api/v1/namespaces/xxxxxxxx": dial tcp 127.0.0.1:80: connect: connection refused

chandankashyap19 avatar Dec 01 '21 14:12 chandankashyap19

In our case it even tries to connect to a completely different service (a NoMachine web interface), because that service runs on localhost and has a redirect. And this happens even when the cluster endpoint is available.

Get "https://127.0.0.1/nxwebplayer": x509: cannot validate certificate for 127.0.0.1 because it doesn't contain any IP SANs
with module.eks.module.eks.kubernetes_config_map.aws_auth[0],
on .terraform/modules/eks.eks/main.tf line 298, in resource "kubernetes_config_map" "aws_auth":
298: resource "kubernetes_config_map" "aws_auth"

With the Helm provider it says the following; somehow the configuration went missing entirely.

Kubernetes cluster unreachable: invalid configuration: no configuration has been provided, try setting KUBERNETES_MASTER environment variable

simwak avatar Mar 22 '22 19:03 simwak

Hi Team - We have observed a related issue when the provider "kubernetes" {} block is omitted entirely, which also results in the provider unexpectedly attempting to contact localhost. From a UX standpoint, an invalid-configuration error or warning for omitted or empty values would be strongly preferable to silently falling back to localhost.

Terraform version: 1.1.6
Kubernetes provider version: v2.10.0_x5
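
Until the provider itself rejects empty configuration, one possible guard is to validate the inputs before they ever reach the provider block, so the run fails loudly instead of silently targeting localhost. This is a sketch only, with hypothetical variable names:

variable "cluster_endpoint" {
  type = string

  validation {
    condition     = length(var.cluster_endpoint) > 0
    error_message = "Cluster endpoint must be set; refusing to let the provider fall back to localhost."
  }
}

variable "cluster_ca_cert" {
  type = string

  validation {
    condition     = length(var.cluster_ca_cert) > 0
    error_message = "Cluster CA certificate must be set."
  }
}

variable "cluster_token" {
  type      = string
  sensitive = true
}

provider "kubernetes" {
  host                   = var.cluster_endpoint
  cluster_ca_certificate = base64decode(var.cluster_ca_cert)
  token                  = var.cluster_token
}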

apeabody avatar Apr 08 '22 22:04 apeabody

How does one check the kubernetes configuration? https://www.reddit.com/r/Terraform/comments/vsme03/how_do_i_verify_the_kubernetes_provider/if25eb2/?context=3

kaihendry avatar Jul 06 '22 13:07 kaihendry

I have a similar issue with the kubernetes provider on a different cloud provider. The interesting part is that the provider config works fine on the first run, but on subsequent plan or apply it fails with this same error. It seems like the exec block is just not being re-run, so there is no config / no token, and the provider defaults to an empty configuration, which somehow turns into localhost.

The real problem lies in the exec block issue, though...
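
For reference, a sketch of the exec pattern under discussion, borrowing the AWS CLI from earlier in the thread purely as an example (the cluster name and variables are hypothetical). The key point is that the exec command is invoked on every plan/apply, so credentials are regenerated each time rather than read from a stale config:

variable "cluster_endpoint" { type = string }
variable "cluster_ca_cert"  { type = string }

provider "kubernetes" {
  host                   = var.cluster_endpoint
  cluster_ca_certificate = base64decode(var.cluster_ca_cert)

  exec {
    api_version = "client.authentication.k8s.io/v1beta1"
    # Any command that prints an ExecCredential JSON document works here;
    # it must run and succeed on each plan/apply, or the provider is left
    # with an empty configuration that degrades into localhost.
    command = "aws"
    args    = ["eks", "get-token", "--cluster-name", "my-cluster"] # hypothetical name
  }
}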

streamnsight avatar Mar 09 '23 22:03 streamnsight

Hello here! Any news about this issue? It looks like I hit the same problem as described here and in #2127.

casualuser avatar Nov 25 '23 23:11 casualuser