hubble icon indicating copy to clipboard operation
hubble copied to clipboard

--namespace is incompatible with --pod

Open kaworu opened this issue 5 years ago • 13 comments

Currently, it is not possible to give --namespace along with --pod to hubble.

% hubble observe --from-namespace cilium --from-pod hubble-relay
invalid argument "hubble-relay" for "--pod" flag: filters --from-namespace and --pod cannot be combined

An alternative way to express this is the following:

% hubble observe --from-label io.kubernetes.pod.namespace=cilium --from-pod hubble-relay

The issue with this is that --from-pod hubble-relay is implicitly referring to the hubble-relay pod in the default namespace. Here I would expect to see the the Flows from the cilium/hubble-relay pod.

cc @michi-covalent @gandro @nathanjsweet

kaworu avatar Sep 08 '20 16:09 kaworu

Namespace was implemented as kind of a shortcut for pod name. We can re-write the implementation to use the label instead of pod. https://github.com/cilium/hubble/blob/master/cmd/observe/observe_filter.go#L271

glibsm avatar Sep 08 '20 16:09 glibsm

@glibsm thanks for this pointer. I think we also have to take --service into account, e.g. hubble observe --from-namespace cilium --from-service hubble-relay should work.

I could write a proposition of behaviour changes for the CLI by next week, and if approved we can get started on a PR. Sounds good?

kaworu avatar Sep 08 '20 17:09 kaworu

@kAworu sounds good to me ship it 💯

michi-covalent avatar Sep 08 '20 17:09 michi-covalent

I'm not sure what the issue is with from-ns and from-service, but at least they are not conflicting out of the box: https://github.com/cilium/hubble/blob/fe34f3d33be9199ce638fafafdbf921e2db02aca/cmd/observe/observe_filter.go#L112

glibsm avatar Sep 08 '20 17:09 glibsm

I think we have consensus that we can change the CLI implementation of --{from-,to-,}namespace to be a label filter instead of a pod filter instead.

For me, the big question is how to make it so that --pod foo matches any pod with name foo in any namespace (and is not just a shortcut for default/foo). I see client-side and server-side solutions.

The easiest client-side fix I can think of is that we when --{from-,to-,}namespace ns are present, the client always adds ns/ to the pod filter (if present). So the CLI flags --namespace foo --to-pod bar imply a label-filter on io.kubernetes.pod.namespace=cilium and a destination_pod filter for foo/bar.

There are also server-side solutions, e.g. (1) doing a breaking change where the server never adds the default/ namespace, or (2) add a new pod filter that doesn't care about namespaces, or (2) do a server-side version of label-filter modifies pod filter logic. I'm not particularly fond of any server-side solutions I can think of right now.

gandro avatar Sep 09 '20 09:09 gandro

We may need something analogous to --all-namespaces in kubectl. It's not like you can get pods by name across all namespaces, unless you check all namespaces. Then if only the pod name is provided (we can check for / presence), we'd issue request across all namespaces only for the pod name.

I haven't looked into the feasibility of such approach yet.

glibsm avatar Sep 09 '20 16:09 glibsm

We may need something analogous to --all-namespaces in kubectl

Very good point. It's unfortunate that in the current version, some filters like --label, --ip, --fqdn are querying all namespaces by default, while others (e.g. --pod and --service) can only ever query a single namespace (and use default if nothing is specified).

This makes me think if we should just bite the bullet and introduce a breaking change to the server-side parsing of pod and service filters, where they do not imply the default namespace if no namespace was provided.

gandro avatar Sep 10 '20 08:09 gandro

This makes me think if we should just bite the bullet and introduce a breaking change to the server-side parsing of pod and service filters, where they do not imply the default namespace if no namespace was provided.

Technically, yes. In practice, I don't know how breaking of a change that is, actually.

If someone knows the precise name of the pod it's unlikely that very name exists in multiple namespaces.

glibsm avatar Sep 10 '20 15:09 glibsm

Propositions

At a high level, make hubble cli more "intuitive" by behaving like kubectl with respect to namespace and pod selector flags.

1. cli: --namespace along with --pod

hubble observe --from-namespace cilium --from-pod hubble-relay

Doesn't work at the moment, we get:

invalid argument "hubble-relay" for "--from-pod" flag: filters --from-namespace and --from-pod cannot be combined

proposition

Translate this into a [source_pod:"cilium/hubble-relay"] FlowFilter. This is analog to

kubectl --namespace cilium --pod hubble-relay

Fail with an error when we get an inconsistency between the namespace arg and pod arg, e.g. --from-namespace cilium --from-pod default/hubble-relay.

Also apply for destinations and services.

2. cli: handle the namespace label specifically

hubble observe --from-label io.kubernetes.pod.namespace=cilium --from-pod hubble-relay

This is translated as [source_pod:"hubble-relay" source_label:"io.kubernetes.pod.namespace=cilium"] which won't ever yield any flow as the source_pod part is equivalent to default/hubble-relay.

proposition

translate this into either:

  • [source_pod:"cilium/hubble-relay"]
  • [source_pod:"cilium/hubble-relay", source_label:"io.kubernetes.pod.namespace=cilium"]

Additionally, lookout for inconsistencies between --from-label io.kubernetes.pod.namespace=cilium, --from-namespace, and --from-pod.

Also apply for destinations and services.

3. cli: namespace consistency between --from-pod and --from-service

hubble observe --from-pod kube-system/hubble-relay --from-service default/hubble-relay

currently translate into [source_pod:"kube-system/hubble-relay" source_service:"default/hubble-relay"] which won't ever yield any flow.

proposition

Fail with an error when we get an inconsistency between the service arg and pod arg

Also apply for destinations.

4. FlowFilter

unlike the other proposition, this one will require some changes on the Hubble server side.

Force clients to always use the form namespace/pod for source_pod, destination_pod. This mean that we'll add the default/ part on some queries, like hubble observe --from-pod hubble-relay. The form without slash (implicit default/) can be deprecated and kept for a while for backward compatibility's sake.

An empty name will act as wildcard. We already use this:

hubble observe --from-namespace cilium

Is currently translated into [source_pod:"cilium/"] which mean "every pod under the cilium namespace".

With that scheme it is easy to design a --all-namespace flag akin to kubectl as @glibsm suggested:

hubble observe --from-all-namespace --from-pod hubble-relay

Will be translated into [source_pod:"/hubble-relay"]

To query all pods in all namespaces, we would have [source_pod:"/"]. As for the implicit default/ we can keep the current FlowFilter [] compatible for a while.

Also apply for destinations and services.

kaworu avatar Oct 01 '20 15:10 kaworu

@kAworu All of this makes a lot of sense to me. I especially like that you came up with a backward-compatible solution for --all-namespaces in 4!

Minor comment: I strongly believe that hubble observe --from-label io.kubernetes.pod.namespace=cilium --from-pod hubble-relay should be translated to [source_pod:"cilium/hubble-relay", source_label:"io.kubernetes.pod.namespace=cilium"], i.e. the label filter should be preserved it the user specified it to avoid any surprises.

gandro avatar Oct 05 '20 10:10 gandro

As much as I know this is a hard ask, It would also be ideal if the Flow protobuf separated out the namespace from the pod name field. Then filtering in hubble would be simpler, and it would make it easier for SIEM/logging solutions to ingest the logs and filter by namespace in queries, as today it's a bit complicated to query by namespace in tools ingesting JSON flow logs.

chancez avatar Aug 24 '22 18:08 chancez

in the Flow protobuf itself namespace and pod_name are already separate fields right? i thought the issue was the way we defined the namespace/pod filter syntax: https://github.com/cilium/cilium/blob/master/pkg/hubble/filters/k8s.go#L36

michi-covalent avatar Aug 24 '22 19:08 michi-covalent

in the Flow protobuf itself namespace and pod_name are already separate fields right? i thought the issue was the way we defined the namespace/pod filter syntax: https://github.com/cilium/cilium/blob/master/pkg/hubble/filters/k8s.go#L36

Ah you're right. I guess I assumed they were the same! Good to know.

chancez avatar Aug 25 '22 15:08 chancez

As an addition to @kaworu 4th option, we can also add a custom user-agent/header to our hubble CLI gRPC queries and the server can use that to determine if it should use the implicit default/ namespace for older versions. Newer versions will specify their hubble CLI version in the headers so the server could then do the new behavior for new clients.

chancez avatar Jan 13 '23 21:01 chancez

I started looking into this and started an MVP for option 4. So far I think everything seems to mostly work as expected. At the moment I believe the behavior is completely backwards compatible, because the new behavior is only in effect when using the pod and namespace filters together, which was previously not allowed via CLI.

Branches:

  • Cilium: https://github.com/cilium/cilium/tree/pr/chancez/flow_filter_namespace
  • Hubble CLI: https://github.com/cilium/hubble/tree/pr/chancez/remove_namespace_pod_filter_conflict

The main thing I think is worth noting is that if you only specify the pod filters, then it uses the existing behavior, and if you specify namespace filters it will ignore the namespace portion of the pod filters. Additionally, when you use the namespace flags with the Hubble CLI, it sets both pod filters, and namespace filters. Combined, this means that the server uses the new behavior only when both pod and namespace filters are specified together. The server never supported namespace flow filters prior to this change, so the new filters will be ignored on old versions of Cilium, meaning the old behavior will be retained. If you try using both --pod and --namespace together on an older version of Hubble, then it will basically ignore the --namespace and act as --pod default/<pod_name>.

As a future step, the client and server could negotiate to determine which behavior to use, or the server could simply deprecate the use of namespace in pod filters, and eventually remove it, leaving only the new behavior.

Some example raw filter output:

$ ./hubble observe --to-namespace kube-system --to-pod hubble-relay --print-raw-filters
allowlist:
- '{"destination_pod":["kube-system/","hubble-relay"],"destination_namespace":["kube-system"]}'
./hubble observe --to-namespace kube-system --print-raw-filters                                                                                                                                          pr/chancez/remove_namespace_pod_filter_conflict ✚
allowlist:
- '{"destination_pod":["kube-system/"],"destination_namespace":["kube-system"]}'
$ ./hubble observe --to-pod hubble-relay --print-raw-filters
allowlist:
- '{"destination_pod":["hubble-relay"]}'
$ ./hubble observe --to-pod kube-system/hubble-relay --print-raw-filters
allowlist:
- '{"destination_pod":["kube-system/hubble-relay"]}'

And agent logs for each with some extra printlns indicating "how" it's filtering, using the some commands as above:

2023-01-14T00:59:06.001687308Z filter by podName [hubble-relay]
2023-01-14T00:59:06.001703933Z filter by namespace [kube-system]
2023-01-14T00:59:06.002010102Z level=debug msg="GetFlows finished" blacklist="{}" buffer_size=4095 number_of_flows=20 subsys=hubble took="313.211µs" whitelist="{destination_pod:\"kube-system/\" destination_pod:\"hubble-relay\" destination_namespace:\"kube-system\"}"
2023-01-14T00:59:15.442809199Z filter by namespace [kube-system]
2023-01-14T00:59:15.442984076Z level=debug msg="GetFlows finished" blacklist="{}" buffer_size=4095 number_of_flows=20 subsys=hubble took="159.501µs" whitelist="{destination_pod:\"kube-system/\" destination_namespace:\"kube-system\"}"
2023-01-14T00:59:22.823465864Z filter by NamespacedName: [{default hubble-relay}]
2023-01-14T00:59:22.824444413Z level=debug msg="GetFlows finished" blacklist="{}" buffer_size=4095 number_of_flows=0 subsys=hubble took=1.131509ms whitelist="{destination_pod:\"hubble-relay\"}"
2023-01-14T00:59:44.588980112Z filter by NamespacedName: [{kube-system hubble-relay}]
2023-01-14T00:59:44.589282490Z level=debug msg="GetFlows finished" blacklist="{}" buffer_size=4095 number_of_flows=20 subsys=hubble took="216.043µs" whitelist="{destination_pod:\"kube-system/hubble-relay\"}"

chancez avatar Jan 14 '23 00:01 chancez

After talking with Alex, I might try the CLI focused approach(es).

chancez avatar Jan 19 '23 16:01 chancez

I'd be happy to work on this

glrf avatar Oct 25 '23 15:10 glrf

Re-opening as there is still https://github.com/cilium/cilium/pull/28921 and its client implementation related to this issue.

kaworu avatar Nov 28 '23 08:11 kaworu