
GKE private cluster blocks tap ports by default

Open tkh opened this issue 5 years ago • 1 comments

Bug Report

What is the issue?

Starting today with a fresh GKE cluster with private nodes, installing linkerd via the CLI works fine, but tap is blocked by the GCP firewall.

How can it be reproduced?

  • Create GKE cluster with private nodes
  • Install linkerd via CLI script
  • Attempt to run tap or use it via dashboard:
$ linkerd tap deployment/tap --namespace linkerd
Error: HTTP error, status Code [503] (unexpected API response: service unavailable
)
Usage:
  linkerd tap [flags] (RESOURCE)
...

Logs, error output, etc


linkerd check output

kubernetes-api
--------------
√ can initialize the client
√ can query the Kubernetes API

kubernetes-version
------------------
√ is running the minimum Kubernetes API version
√ is running the minimum kubectl version

linkerd-config
--------------
√ control plane Namespace exists
√ control plane ClusterRoles exist
√ control plane ClusterRoleBindings exist
√ control plane ServiceAccounts exist
√ control plane CustomResourceDefinitions exist
√ control plane MutatingWebhookConfigurations exist
√ control plane ValidatingWebhookConfigurations exist
√ control plane PodSecurityPolicies exist

linkerd-existence
-----------------
√ 'linkerd-config' config map exists
√ control plane replica sets are ready
√ no unschedulable pods
√ controller pod is running
√ can initialize the client
√ can query the control plane API

linkerd-api
-----------
√ control plane pods are ready
√ control plane self-check
√ [kubernetes] control plane can talk to Kubernetes
√ [prometheus] control plane can talk to Prometheus
√ no invalid service profiles

linkerd-version
---------------
√ can determine the latest version
√ cli is up-to-date

control-plane-version
---------------------
√ control plane is up-to-date
√ control plane and cli versions match

Status check results are √

Environment

  • Kubernetes Version: v1.13.7-gke.19
  • Cluster Environment: GKE
  • Host OS: COS
  • Linkerd version: v2.5.0

Possible solution

Open the tap apiserver port in the GCP firewall for the cluster's network.

Additional context

I spoke to @grampelberg earlier today for help debugging this, and he figured it out. I was able to fix it by opening port 8089/TCP, which matches the apiserver port on the tap pod.

This looks unique to GKE private clusters, so it is a documentation need for GKE rather than a bug. The steps here, with the port adjusted for tap, solve the issue for the dashboard. The CLI gets past the failure, but I haven't adjusted RBAC to go further there.
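
For anyone hitting the same thing, the firewall change can be sketched roughly as below. This is only a sketch under assumptions, not an official fix: the rule name, the `default` network, and the `172.16.0.0/28` master CIDR are placeholders, and `create_tap_rule` is a hypothetical helper. Only port 8089/TCP comes from this issue. By default the script just prints the `gcloud` command, so it can be reviewed before anything is changed.

```shell
#!/bin/sh
# Sketch: let the GKE control plane reach the tap pod's apiserver port.
# RULE_NAME and the example network/CIDR below are placeholders (assumptions).
RULE_NAME="${RULE_NAME:-gke-master-to-linkerd-tap}"
TAP_PORT=8089   # port reported in this issue

# The master CIDR of a private cluster can usually be looked up with
# something like (CLUSTER and ZONE are placeholders):
#   gcloud container clusters describe CLUSTER --zone ZONE \
#     --format='value(privateClusterConfig.masterIpv4CidrBlock)'

# create_tap_rule NETWORK MASTER_CIDR
# Prints the command when DRY_RUN=1 (the default); runs it when DRY_RUN=0.
create_tap_rule() {
  prefix=""
  [ "${DRY_RUN:-1}" = "1" ] && prefix="echo"
  $prefix gcloud compute firewall-rules create "$RULE_NAME" \
    --network "$1" --direction INGRESS \
    --allow "tcp:${TAP_PORT}" --source-ranges "$2"
}

# Example invocation with placeholder values (dry run unless DRY_RUN=0).
create_tap_rule "${NETWORK:-default}" "${MASTER_CIDR:-172.16.0.0/28}"
```

Running it with `DRY_RUN=0` and real values for `NETWORK` and `MASTER_CIDR` would create the rule. Without `--target-tags` the rule applies to every instance in the network, so narrowing it to the node tags of the cluster is probably a good idea.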

tkh, Aug 23 '19 20:08

This section solved the issue for me without creating another firewall rule for tap

Edit: I actually still had to create a GCP firewall rule for the tap port 8089 to make it work.

smeeklai, Sep 04 '19 09:09