TOB-K8S-024: kubelet liveness probes can be used to enumerate host network

Open cji opened this issue 6 years ago • 17 comments

This issue was reported in the Kubernetes Security Audit Report

Description

Kubernetes supports both readiness and liveness probes to detect when a Pod is operating correctly, and when to begin or stop directing traffic to a Pod. Three methods are available to facilitate these probes: command execution, HTTP, and TCP.

Using the HTTP and TCP probes, it is possible for an operator with limited access to the cluster (purely Kubernetes-service-related access) to enumerate the underlying host network. This is possible due to the scope in which these probes execute. Unlike the command execution probe, which executes a command within the container, the TCP and HTTP probes execute from the context of the kubelet process. Thus, the host's network interfaces are used, and the operator can specify hosts that may not be reachable from the Pods the kubelet is managing.

The enumeration of the host network uses the container’s health and readiness to determine the status of the remote host. If the Pod is killed and restarted due to a failed liveness probe, this indicates that the host is inaccessible. If the Pod successfully passes the liveness check and is presented as ready, the host is accessible. Together, these two outcomes act as a boolean signal of whether a given host is reachable from the node running the kubelet.
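To make the mechanism concrete, here is a minimal sketch in Python that builds such a Pod manifest (the API server accepts JSON as well as YAML) and interprets the observed Pod state as the boolean signal described above. This is not the Appendix E example from the report. The helper names, the placeholder image, and the timing values are assumptions for illustration; `livenessProbe.tcpSocket.host` is the actual Pod API field.

```python
import json


def build_probe_pod(name, target_host, target_port):
    """Return a Pod manifest whose liveness probe targets an arbitrary host.

    Hypothetical helper: the image and timing values are placeholders.
    """
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name},
        "spec": {
            "containers": [
                {
                    "name": "probe",
                    "image": "registry.k8s.io/pause:3.9",  # placeholder image
                    "livenessProbe": {
                        # When host is omitted the probe targets the Pod IP.
                        # Setting it makes the kubelet connect to that host
                        # from the node's own network context.
                        "tcpSocket": {"host": target_host, "port": target_port},
                        "periodSeconds": 5,
                        "failureThreshold": 1,
                    },
                }
            ]
        },
    }


def host_reachable(restart_count, ready):
    """Map observed Pod state onto the accessible/inaccessible signal.

    Returns False if the container was restarted (probe failed), True if the
    Pod is ready with no restarts, and None if there is no verdict yet.
    """
    if restart_count > 0:
        return False
    if ready:
        return True
    return None


if __name__ == "__main__":
    # Print a manifest that could be submitted with `kubectl apply -f -`.
    print(json.dumps(build_probe_pod("probe-example", "10.0.0.5", 22), indent=2))
```

The restart count and readiness would be read from the Pod's status after at least one probe period has elapsed; how quickly a verdict appears depends on the probe timing fields.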

Additionally, an attacker can append headers through the Pod specification, which are interpreted by the Go HTTP library as authentication or additional request headers. This can allow an attacker to abuse liveness probes to access a wider range of cluster resources.
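As a rough illustration of the header point, the sketch below builds an `httpGet` probe fragment that carries extra request headers. The function name and the example values are hypothetical; `httpGet.host` and `httpGet.httpHeaders` (a list of `name`/`value` pairs) are the fields defined by the Pod API.

```python
def http_probe_with_headers(host, port, path, headers):
    """Build an httpGet probe fragment with caller-supplied request headers.

    Hypothetical helper for illustration; `headers` is a plain dict.
    """
    return {
        "httpGet": {
            "host": host,
            "port": port,
            "path": path,
            # Each entry is sent by the kubelet as a header on the probe request.
            "httpHeaders": [
                {"name": name, "value": value} for name, value in headers.items()
            ],
        }
    }
```

For example, `http_probe_with_headers("10.0.0.5", 8080, "/healthz", {"Authorization": "Bearer <token>"})` yields a probe whose request includes an `Authorization` header chosen by whoever wrote the Pod spec.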

An example Pod file that can enumerate the host network is available in Appendix E.

Exploit Scenario

Alice configures a cluster which restricts communications between services on the cluster. Eve gains access to Alice’s cluster, and subsequently submits many Pods enumerating the host network in an attempt to gain information about Alice’s underlying host network.

Recommendations

Short term, restrict the kubelet in a way that prevents the kubelet from probing hosts it does not manage directly.

Long term, consider restricting probes to the container runtime, allowing liveness to be determined within the scope of the container-networking interface.

Anything else we need to know?:

See #81146 for current status of all issues created from these findings.

The vendor gave this issue an ID of TOB-K8S-024 and it was finding 21 of the report.

The vendor considers this issue Medium Severity.

To view the original finding, begin on page 60 of the Kubernetes Security Review Report

Environment:

  • Kubernetes version: 1.13.4

cji avatar Aug 08 '19 02:08 cji

/sig node
/area security
/kind bug
/remove-kind feature

joelsmith avatar Aug 08 '19 05:08 joelsmith

Preventing pods without hostNetwork from probing other hosts should be sufficient. Pods with hostNetwork can already probe the host. PSP controls hostNetwork.

smarterclayton avatar Aug 10 '19 16:08 smarterclayton

In a previous discussion, we had discussed marking the Host fields of HTTPGetAction and TCPSocketAction as deprecated. It looks like that never actually happened.

/cc @thockin

tallclair avatar Aug 12 '19 20:08 tallclair

Issues go stale after 90d of inactivity. Mark the issue as fresh with /remove-lifecycle stale. Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.

/lifecycle stale

fejta-bot avatar Nov 10 '19 21:11 fejta-bot

/remove-lifecycle stale

neolit123 avatar Nov 10 '19 21:11 neolit123

Issues go stale after 90d of inactivity. Mark the issue as fresh with /remove-lifecycle stale. Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.

/lifecycle stale

fejta-bot avatar Feb 08 '20 21:02 fejta-bot

/lifecycle frozen

cji avatar Feb 09 '20 14:02 cji

Nice one!

/triage accepted
/priority important-longterm
/assign

matthyx avatar Jun 25 '21 11:06 matthyx

> In a previous discussion, we had discussed marking the Host fields of HTTPGetAction and TCPSocketAction as deprecated. It looks like that never actually happened.
>
> /cc @thockin

@thockin what do you think about this proposal from @tallclair

matthyx avatar Aug 04 '21 12:08 matthyx

/assign @thockin

matthyx avatar Sep 06 '21 14:09 matthyx

This fell into a lot of cracks.

Deprecating the host field seems reasonable, but I don't think it addresses the problem, unless I am missing something?

We could, as Clayton suggests, limit use of this field to hostNetwork pods - either at API time (preferred) or at runtime. But it would have to be opt-in (e.g. another admission controller like externalIPs).

Technically this is not difficult - is it sufficient?
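For illustration only, the check being discussed (allow the probe `host` field only on hostNetwork Pods) could be sketched roughly as below. This is a standalone Python function over a Pod manifest dict, not an actual admission controller or any existing Kubernetes code; the function names are hypothetical, and it assumes only probes are in scope (lifecycle hooks are not checked).

```python
PROBE_KINDS = ("livenessProbe", "readinessProbe", "startupProbe")
ACTIONS = ("httpGet", "tcpSocket")  # the two probe actions with a host field


def probe_hosts(pod):
    """Yield (container, probe kind, action, host) for every probe host set."""
    spec = pod.get("spec") or {}
    containers = (spec.get("containers") or []) + (spec.get("initContainers") or [])
    for container in containers:
        for kind in PROBE_KINDS:
            probe = container.get(kind) or {}
            for action in ACTIONS:
                host = (probe.get(action) or {}).get("host")
                if host:
                    yield container.get("name", ""), kind, action, host


def validate_pod(pod):
    """Return violation messages; an empty list means the Pod is allowed."""
    if (pod.get("spec") or {}).get("hostNetwork", False):
        # hostNetwork Pods can already reach the host network directly.
        return []
    return [
        f"container {name!r}: {kind}.{action} sets host {host!r} without hostNetwork"
        for name, kind, action, host in probe_hosts(pod)
    ]
```

A Pod that leaves `host` unset passes unchanged, so the default behavior of probing the Pod IP is unaffected.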

thockin avatar Feb 19 '22 00:02 thockin

@thockin glad you're back into this, let me prep an answer on Monday. Enjoy your weekend!

matthyx avatar Feb 19 '22 21:02 matthyx

> Deprecating the host field seems reasonable, but I don't think it addresses the problem, unless I am missing something?

Indeed, it would be like killing a fly with a sledgehammer...

> We could, as Clayton suggests, limit use of this field to hostNetwork pods - either at API time (preferred) or at runtime. But it would have to be opt-in (e.g. another admission controller like externalIPs).

OK, works for me. I think it should be sufficient to mitigate the vulnerability. I guess we'll need a KEP for that?

matthyx avatar Feb 21 '22 07:02 matthyx

I'd be open to discussing whether this would be a fit for the Pod Security Standards and inclusion in Pod Security admission by extension. IMO this is hovering around the border of the scope of the pod security standards, but they do already have opinions on HostNetwork, HostPorts, and CAP_NET_RAW.

Any core solution to this is going to need a KEP.

tallclair avatar Feb 23 '22 01:02 tallclair

This issue has not been updated in over 1 year, and should be re-triaged.

You can:

  • Confirm that this issue is still relevant with /triage accepted (org members only)
  • Close this issue with /close

For more details on the triage process, see https://www.kubernetes.dev/docs/guide/issue-triage/

/remove-triage accepted

k8s-triage-robot avatar Feb 23 '23 01:02 k8s-triage-robot

/triage accepted

This is still relevant.

tallclair avatar Feb 24 '23 01:02 tallclair

This issue has not been updated in over 1 year, and should be re-triaged.

You can:

  • Confirm that this issue is still relevant with /triage accepted (org members only)
  • Close this issue with /close

For more details on the triage process, see https://www.kubernetes.dev/docs/guide/issue-triage/

/remove-triage accepted

k8s-triage-robot avatar Feb 24 '24 01:02 k8s-triage-robot