--namespace is incompatible with --pod
Currently, it is not possible to give --namespace along with --pod to hubble.
% hubble observe --from-namespace cilium --from-pod hubble-relay
invalid argument "hubble-relay" for "--pod" flag: filters --from-namespace and --pod cannot be combined
An alternative way to express this is the following:
% hubble observe --from-label io.kubernetes.pod.namespace=cilium --from-pod hubble-relay
The issue with this is that --from-pod hubble-relay implicitly refers to the hubble-relay pod in the default namespace. Here I would expect to see the flows from the cilium/hubble-relay pod.
cc @michi-covalent @gandro @nathanjsweet
Namespace was implemented as kind of a shortcut for the pod name. We can rewrite the implementation to use the label instead of the pod filter. https://github.com/cilium/hubble/blob/master/cmd/observe/observe_filter.go#L271
@glibsm thanks for this pointer. I think we also have to take --service into account, e.g. hubble observe --from-namespace cilium --from-service hubble-relay should work.
I could write up a proposal for the CLI behaviour changes by next week, and if it is approved we can get started on a PR. Sounds good?
@kAworu sounds good to me ship it 💯
I'm not sure what the issue is with from-ns and from-service, but at least they are not conflicting out of the box:
https://github.com/cilium/hubble/blob/fe34f3d33be9199ce638fafafdbf921e2db02aca/cmd/observe/observe_filter.go#L112
I think we have consensus that we can change the CLI implementation of --{from-,to-,}namespace to be a label filter instead of a pod filter.
For me, the big question is how to make it so that --pod foo matches any pod with name foo in any namespace (and is not just a shortcut for default/foo). I see client-side and server-side solutions.
The easiest client-side fix I can think of is that when --{from-,to-,}namespace ns is present, the client always adds ns/ to the pod filter (if present). So the CLI flags --namespace foo --to-pod bar imply a label filter on io.kubernetes.pod.namespace=foo and a destination_pod filter for foo/bar.
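To make this concrete, here is a minimal Go sketch of that client-side fix; the function name is hypothetical, not the actual Hubble CLI code:

package main

import (
	"fmt"
	"strings"
)

// qualifyPod returns the pod filter to send to the server, given the
// --namespace and --pod flag values: a bare pod name gets prefixed with
// the namespace, an already-namespaced one is left untouched.
func qualifyPod(namespace, pod string) string {
	if pod == "" || strings.Contains(pod, "/") {
		return pod
	}
	return namespace + "/" + pod
}

func main() {
	// --namespace foo --to-pod bar => destination_pod filter "foo/bar"
	fmt.Println(qualifyPod("foo", "bar"))
}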
There are also server-side solutions, e.g. (1) making a breaking change where the server never adds the default/ namespace, (2) adding a new pod filter that doesn't care about namespaces, or (3) doing a server-side version of the "label filter modifies pod filter" logic. I'm not particularly fond of any server-side solution I can think of right now.
We may need something analogous to --all-namespaces in kubectl. It's not like you can get pods by name across all namespaces, unless you check each namespace. Then, if only the pod name is provided (we can check for the presence of /), we'd issue the request across all namespaces for just that pod name.
I haven't looked into the feasibility of such an approach yet.
We may need something analogous to --all-namespaces in kubectl
Very good point. It's unfortunate that in the current version, some filters like --label, --ip, --fqdn are querying all namespaces by default, while others (e.g. --pod and --service) can only ever query a single namespace (and use default if nothing is specified).
This makes me wonder whether we should just bite the bullet and introduce a breaking change to the server-side parsing of pod and service filters, where they would not imply the default namespace if no namespace was provided.
Technically, yes. In practice, I don't know how breaking such a change would actually be. If someone knows the precise name of a pod, it's unlikely that the very same name exists in multiple namespaces.
Propositions
At a high level, make the Hubble CLI more "intuitive" by behaving like kubectl with respect to namespace and pod selector flags.
1. cli: --namespace along with --pod
hubble observe --from-namespace cilium --from-pod hubble-relay
This doesn't work at the moment; we get:
invalid argument "hubble-relay" for "--from-pod" flag: filters --from-namespace and --from-pod cannot be combined
Proposition
Translate this into a [source_pod:"cilium/hubble-relay"] FlowFilter. This is analogous to
kubectl --namespace cilium --pod hubble-relay
Fail with an error when we get an inconsistency between the namespace arg and the pod arg, e.g. --from-namespace cilium --from-pod default/hubble-relay (see the sketch below).
The same applies to destinations and services.
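As a hedged illustration of that consistency check (hypothetical names, not the real flag-parsing code), in Go:

package main

import (
	"fmt"
	"strings"
)

// checkConsistency errors out when a namespaced pod filter disagrees
// with an explicit namespace flag.
func checkConsistency(namespace, pod string) error {
	ns, _, found := strings.Cut(pod, "/")
	if found && ns != namespace {
		return fmt.Errorf("namespace %q conflicts with pod filter %q", namespace, pod)
	}
	return nil
}

func main() {
	// --from-namespace cilium --from-pod default/hubble-relay => error
	fmt.Println(checkConsistency("cilium", "default/hubble-relay"))
}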
2. cli: handle the namespace label specifically
hubble observe --from-label io.kubernetes.pod.namespace=cilium --from-pod hubble-relay
This is translated into [source_pod:"hubble-relay" source_label:"io.kubernetes.pod.namespace=cilium"], which will never yield any flow because the source_pod part is equivalent to default/hubble-relay.
Proposition
Translate this into either:
- [source_pod:"cilium/hubble-relay"], or
- [source_pod:"cilium/hubble-relay", source_label:"io.kubernetes.pod.namespace=cilium"]
Additionally, look out for inconsistencies between --from-label io.kubernetes.pod.namespace=cilium, --from-namespace, and --from-pod (see the sketch below).
The same applies to destinations and services.
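A rough Go sketch of this translation, following the second option above (illustrative names, not the actual observe_filter.go code):

package main

import (
	"fmt"
	"strings"
)

const nsLabel = "io.kubernetes.pod.namespace="

type flowFilter struct {
	SourcePod   []string
	SourceLabel []string
}

// translate qualifies a bare pod name with the namespace taken from the
// namespace label, while preserving the label filter itself.
func translate(labels []string, pod string) flowFilter {
	for _, l := range labels {
		if ns, ok := strings.CutPrefix(l, nsLabel); ok && !strings.Contains(pod, "/") {
			pod = ns + "/" + pod
		}
	}
	return flowFilter{SourcePod: []string{pod}, SourceLabel: labels}
}

func main() {
	f := translate([]string{"io.kubernetes.pod.namespace=cilium"}, "hubble-relay")
	fmt.Printf("%+v\n", f)
	// {SourcePod:[cilium/hubble-relay] SourceLabel:[io.kubernetes.pod.namespace=cilium]}
}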
3. cli: namespace consistency between --from-pod and --from-service
hubble observe --from-pod kube-system/hubble-relay --from-service default/hubble-relay
This currently translates into [source_pod:"kube-system/hubble-relay" source_service:"default/hubble-relay"], which will never yield any flow.
Proposition
Fail with an error when we get an inconsistency between the service arg and the pod arg (see the sketch below).
The same applies to destinations.
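A small sketch of this check in Go (hypothetical code): both filters must live in the same namespace once their namespace/name forms are split.

package main

import (
	"fmt"
	"strings"
)

// namespaceOf returns the namespace component of a "namespace/name"
// filter, or the empty string if there is none.
func namespaceOf(namespaced string) string {
	if ns, _, found := strings.Cut(namespaced, "/"); found {
		return ns
	}
	return ""
}

func main() {
	pod, svc := "kube-system/hubble-relay", "default/hubble-relay"
	if namespaceOf(pod) != namespaceOf(svc) {
		fmt.Printf("pod filter %q and service filter %q are in different namespaces\n", pod, svc)
	}
}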
4. FlowFilter
Unlike the other propositions, this one requires some changes on the Hubble server side.
Force clients to always use the form namespace/pod for source_pod and destination_pod. This means that we'll add the default/ part on some queries, like hubble observe --from-pod hubble-relay. The form without a slash (implicit default/) can be deprecated and kept for a while for backward compatibility's sake.
An empty name acts as a wildcard. We already use this:
hubble observe --from-namespace cilium
is currently translated into [source_pod:"cilium/"], which means "every pod under the cilium namespace".
With that scheme it is easy to design an --all-namespaces flag akin to kubectl's, as @glibsm suggested:
hubble observe --from-all-namespaces --from-pod hubble-relay
will be translated into [source_pod:"/hubble-relay"].
To query all pods in all namespaces, we would have [source_pod:"/"]. As with the implicit default/, we can keep the current FlowFilter [] compatible for a while (a sketch of these matching rules follows).
The same applies to destinations and services.
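A Go sketch of the matching rules this scheme implies on the server side (hypothetical code, not the actual pkg/hubble/filters logic); filters are assumed to always be in namespace/pod form, with an empty component acting as a wildcard:

package main

import (
	"fmt"
	"strings"
)

// matches reports whether a flow's namespace and pod name match a
// "namespace/pod" filter where either component may be empty.
func matches(filter, flowNS, flowPod string) bool {
	fns, fpod, _ := strings.Cut(filter, "/")
	if fns != "" && fns != flowNS {
		return false // namespace given and different
	}
	if fpod != "" && fpod != flowPod {
		return false // pod name given and different
	}
	return true
}

func main() {
	fmt.Println(matches("cilium/", "cilium", "hubble-relay"))            // true: every pod in cilium
	fmt.Println(matches("/hubble-relay", "kube-system", "hubble-relay")) // true: any namespace
	fmt.Println(matches("/", "default", "coredns"))                      // true: all pods everywhere
}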
@kAworu All of this makes a lot of sense to me. I especially like that you came up with a backward-compatible solution for --all-namespaces in 4!
Minor comment:
I strongly believe that hubble observe --from-label io.kubernetes.pod.namespace=cilium --from-pod hubble-relay should be translated to [source_pod:"cilium/hubble-relay", source_label:"io.kubernetes.pod.namespace=cilium"], i.e. the label filter should be preserved if the user specified it, to avoid any surprises.
As much as I know this is a hard ask, it would also be ideal if the Flow protobuf separated the namespace from the pod name field. Then filtering in Hubble would be simpler, and it would be easier for SIEM/logging solutions to ingest the logs and filter by namespace in queries, as today it's a bit complicated to query by namespace in tools ingesting JSON flow logs.
in the Flow protobuf itself namespace and pod_name are already separate fields right? i thought the issue was the way we defined the namespace/pod filter syntax: https://github.com/cilium/cilium/blob/master/pkg/hubble/filters/k8s.go#L36
Ah you're right. I guess I assumed they were the same! Good to know.
As an addition to @kAworu's 4th option, we can also add a custom user-agent/header to our Hubble CLI gRPC queries, and the server can use that to determine whether it should use the implicit default/ namespace for older versions. Newer versions would specify their Hubble CLI version in the headers, so the server could apply the new behavior for new clients.
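A hedged sketch of how that could look with gRPC metadata (the header name and helper functions here are hypothetical; the real negotiation would live in the Hubble server and CLI):

package negotiation

import (
	"context"

	"google.golang.org/grpc/metadata"
)

const versionHeader = "hubble-cli-version" // hypothetical header name

// withVersion attaches the CLI version to every outgoing request.
func withVersion(ctx context.Context, version string) context.Context {
	return metadata.AppendToOutgoingContext(ctx, versionHeader, version)
}

// useLegacyPodFilter treats clients that do not send the header as
// legacy, keeping the implicit default/ namespace behavior for them.
func useLegacyPodFilter(ctx context.Context) bool {
	md, ok := metadata.FromIncomingContext(ctx)
	return !ok || len(md.Get(versionHeader)) == 0
}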
I started looking into this and put together an MVP for option 4. So far everything seems to mostly work as expected. At the moment I believe the behavior is completely backwards compatible, because the new behavior only takes effect when using the pod and namespace filters together, which was previously not allowed via the CLI.
Branches:
- Cilium: https://github.com/cilium/cilium/tree/pr/chancez/flow_filter_namespace
- Hubble CLI: https://github.com/cilium/hubble/tree/pr/chancez/remove_namespace_pod_filter_conflict
The main thing I think is worth noting is that if you only specify the pod filters, then it uses the existing behavior, and if you specify namespace filters it will ignore the namespace portion of the pod filters. Additionally, when you use the namespace flags with the Hubble CLI, it sets both pod filters, and namespace filters. Combined, this means that the server uses the new behavior only when both pod and namespace filters are specified together. The server never supported namespace flow filters prior to this change, so the new filters will be ignored on old versions of Cilium, meaning the old behavior will be retained. If you try using both --pod and --namespace together on an older version of Hubble, then it will basically ignore the --namespace and act as --pod default/<pod_name>.
As a future step, the client and server could negotiate to determine which behavior to use, or the server could simply deprecate the use of namespace in pod filters, and eventually remove it, leaving only the new behavior.
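Distilled from the description above into a hedged Go sketch (hypothetical code, not the actual branch): the namespace filter, when present, takes precedence over the namespace portion of the pod filter, while a bare pod filter alone keeps the old default/ semantics.

package main

import (
	"fmt"
	"strings"
)

// effectiveFilter combines the first pod and namespace filters into the
// "namespace/pod" form the matcher ultimately sees.
func effectiveFilter(namespaces, pods []string) string {
	pod := ""
	if len(pods) > 0 {
		pod = pods[0]
	}
	if len(namespaces) == 0 {
		// Old behavior: a bare pod name implies the default namespace.
		if pod != "" && !strings.Contains(pod, "/") {
			return "default/" + pod
		}
		return pod
	}
	// New behavior: ignore any namespace portion of the pod filter and
	// use the namespace filter instead.
	if _, name, found := strings.Cut(pod, "/"); found {
		pod = name
	}
	return namespaces[0] + "/" + pod
}

func main() {
	fmt.Println(effectiveFilter([]string{"kube-system"}, []string{"hubble-relay"})) // kube-system/hubble-relay
	fmt.Println(effectiveFilter(nil, []string{"hubble-relay"}))                     // default/hubble-relay
}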
Some example raw filter output:
$ ./hubble observe --to-namespace kube-system --to-pod hubble-relay --print-raw-filters
allowlist:
- '{"destination_pod":["kube-system/","hubble-relay"],"destination_namespace":["kube-system"]}'
$ ./hubble observe --to-namespace kube-system --print-raw-filters
allowlist:
- '{"destination_pod":["kube-system/"],"destination_namespace":["kube-system"]}'
$ ./hubble observe --to-pod hubble-relay --print-raw-filters
allowlist:
- '{"destination_pod":["hubble-relay"]}'
$ ./hubble observe --to-pod kube-system/hubble-relay --print-raw-filters
allowlist:
- '{"destination_pod":["kube-system/hubble-relay"]}'
And agent logs for each, with some extra printlns indicating "how" it's filtering, using the same commands as above:
2023-01-14T00:59:06.001687308Z filter by podName [hubble-relay]
2023-01-14T00:59:06.001703933Z filter by namespace [kube-system]
2023-01-14T00:59:06.002010102Z level=debug msg="GetFlows finished" blacklist="{}" buffer_size=4095 number_of_flows=20 subsys=hubble took="313.211µs" whitelist="{destination_pod:\"kube-system/\" destination_pod:\"hubble-relay\" destination_namespace:\"kube-system\"}"
2023-01-14T00:59:15.442809199Z filter by namespace [kube-system]
2023-01-14T00:59:15.442984076Z level=debug msg="GetFlows finished" blacklist="{}" buffer_size=4095 number_of_flows=20 subsys=hubble took="159.501µs" whitelist="{destination_pod:\"kube-system/\" destination_namespace:\"kube-system\"}"
2023-01-14T00:59:22.823465864Z filter by NamespacedName: [{default hubble-relay}]
2023-01-14T00:59:22.824444413Z level=debug msg="GetFlows finished" blacklist="{}" buffer_size=4095 number_of_flows=0 subsys=hubble took=1.131509ms whitelist="{destination_pod:\"hubble-relay\"}"
2023-01-14T00:59:44.588980112Z filter by NamespacedName: [{kube-system hubble-relay}]
2023-01-14T00:59:44.589282490Z level=debug msg="GetFlows finished" blacklist="{}" buffer_size=4095 number_of_flows=20 subsys=hubble took="216.043µs" whitelist="{destination_pod:\"kube-system/hubble-relay\"}"
After talking with Alex, I might try the CLI-focused approach(es).
I'd be happy to work on this
Re-opening as there is still https://github.com/cilium/cilium/pull/28921 and its client implementation related to this issue.