VM access was blocked when eBPF dataplane used

Open TrevorTaoARM opened this issue 3 years ago • 5 comments

When I enabled the Calico eBPF dataplane on a K8s cluster, the VMs on a node configured with the eBPF dataplane (VMs whose NICs are bridged onto the physical NIC of the server) could no longer be reached over SSH. When kube-proxy was restored and the eBPF dataplane disabled, SSH access to the VMs was restored as well.

Expected Behavior

Current Behavior

Possible Solution

Steps to Reproduce (for bugs)

The following script was used to enable the eBPF dataplane:

#!/bin/bash
set -x

WORKDIR=$(pwd)
TMP_DIR=$(mktemp -d)
MARCH=$(uname -m)
CALICO_VERSION=${1:-3.23.2}

# Map the kernel architecture name to the one used in Calico release artifacts.
if [ "$MARCH" == "aarch64" ]; then ARCH=arm64; elif [ "$MARCH" == "x86_64" ]; then ARCH=amd64; else ARCH="unknown"; fi
echo ARCH=$ARCH

# Find the API server endpoint so that Calico can reach it without kube-proxy.
k8s_ep=$(kubectl get endpoints kubernetes -o wide | grep kubernetes | cut -d " " -f 4)
k8s_host=$(echo $k8s_ep | cut -d ":" -f 1)
k8s_port=$(echo $k8s_ep | cut -d ":" -f 2)

cat <<EOF > ${WORKDIR}/k8s_service.yaml
kind: ConfigMap
apiVersion: v1
metadata:
  name: kubernetes-services-endpoint
  namespace: kube-system
data:
  KUBERNETES_SERVICE_HOST: "KUBERNETES_SERVICE_HOST"
  KUBERNETES_SERVICE_PORT: "KUBERNETES_SERVICE_PORT"
EOF
# Replace only the quoted placeholder values, not the keys of the same name.
sed -i "s/\"KUBERNETES_SERVICE_HOST\"/\"${k8s_host}\"/" ${WORKDIR}/k8s_service.yaml
sed -i "s/\"KUBERNETES_SERVICE_PORT\"/\"${k8s_port}\"/" ${WORKDIR}/k8s_service.yaml
kubectl apply -f ${WORKDIR}/k8s_service.yaml

echo "Disable kube-proxy:"
kubectl patch ds -n kube-system kube-proxy -p '{"spec":{"template":{"spec":{"nodeSelector":{"non-calico": "true"}}}}}'

if [ ! -f /usr/local/bin/calicoctl ]; then
    echo "No calicoctl, install now:"
    curl -L https://github.com/projectcalico/calico/releases/download/v${CALICO_VERSION}/calicoctl-linux-${ARCH} -o ${WORKDIR}/calicoctl
    chmod +x ${WORKDIR}/calicoctl
    sudo cp ${WORKDIR}/calicoctl /usr/local/bin
    rm ${WORKDIR}/calicoctl
fi

echo "Enable eBPF:"
calicoctl patch felixconfiguration default --patch='{"spec": {"bpfEnabled": true}}' --allow-version-mismatch

echo "Enable Direct Server Return(DSR) mode: optional"
#calicoctl patch felixconfiguration default --patch='{"spec": {"bpfExternalServiceMode": "DSR"}}'

Context

I tried to access the VM (10.169.210.139), located on a server with Calico eBPF enabled, from another server (10.169.242.130). Only the first ping packet was received; all subsequent ping packets were lost.

The conntrack table on the Calico node showed the SSH access (from 10.169.242.130) to the VM (10.169.210.139):

# calico-node -bpf conntrack dump |grep "10.169.210.139"
2022-07-15 08:21:37.276 [INFO][13703] confd/maps.go 433: Loaded map file descriptor. fd=0x7 name="/sys/fs/bpf/tc/globals/cali_v4_ct2"
ConntrackKey{proto=6 10.169.242.130:61701 <-> 10.169.210.139:22} -> Entry{Type:0, Created:17278773931441431, LastSeen:17278777015499210, Flags: Data: {A2B:{Seqno:92691206 SynSeen:true AckSeen:true FinSeen:false RstSeen:false Whitelisted:true Opener:true Ifindex:2} B2A:{Seqno:959809259 SynSeen:true AckSeen:true FinSeen:false RstSeen:false Whitelisted:false Opener:false Ifindex:0} OrigDst:0.0.0.0 OrigPort:0 OrigSPort:0 TunIP:0.0.0.0}} Age: 3.143463957s Active ago 59.406178ms ESTABLISHED

Your Environment

  • Calico version: v3.23.2
  • Orchestrator version (e.g. kubernetes, mesos, rkt): K8s 1.22.1
  • Operating System and version: Ubuntu 20.04 focal Linux kernel 5.10.0
  • Link to your project (optional):

TrevorTaoARM avatar Jul 28 '22 03:07 TrevorTaoARM

CC @tomastigera

caseydavenport avatar Aug 01 '22 17:08 caseydavenport

I first hit this issue on an arm64 platform; it does not seem to occur on some other platforms, e.g. some x86 systems. I set bpfLogLevel to Debug and compared the eBPF log output carefully for the two kinds of systems:

  1. For arm64 platform:
      <idle>-0       [088] d.s. 1810775.267212: bpf_trace_printk: enp9s0---I: New packet at ifindex=2; mark=0
      <idle>-0       [088] d.s. 1810775.267213: bpf_trace_printk: enp9s0---I: No metadata is shared by XDP
      <idle>-0       [088] d.s. 1810775.267215: bpf_trace_printk: enp9s0---I: IP id=13695 s=aa9d0e5 d=aa9d287
      <idle>-0       [088] d.s. 1810775.267217: bpf_trace_printk: enp9s0---I: ICMP; type=8 code=0
      <idle>-0       [088] d.s. 1810775.267218: bpf_trace_printk: enp9s0---I: CT-1 lookup from aa9d0e5:0
      <idle>-0       [088] d.s. 1810775.267219: bpf_trace_printk: enp9s0---I: CT-1 lookup to   aa9d287:0
      <idle>-0       [088] d.s. 1810775.267221: bpf_trace_printk: enp9s0---I: CT-1 Hit! NORMAL entry.
      <idle>-0       [088] d.s. 1810775.267222: bpf_trace_printk: enp9s0---I: CT-1 result: 0x2003
      <idle>-0       [088] d.s. 1810775.267223: bpf_trace_printk: enp9s0---I: conntrack entry flags 0x100
      <idle>-0       [088] d.s. 1810775.267223: bpf_trace_printk: enp9s0---I: CT Hit
      <idle>-0       [088] d.s. 1810775.267224: bpf_trace_printk: enp9s0---I: Entering calico_tc_skb_accepted_entrypoint
      <idle>-0       [088] d.s. 1810775.267226: bpf_trace_printk: enp9s0---I: IP id=13695 s=aa9d0e5 d=aa9d287
      <idle>-0       [088] d.s. 1810775.267226: bpf_trace_printk: enp9s0---I: Entering calico_tc_skb_accepted
      <idle>-0       [088] d.s. 1810775.267227: bpf_trace_printk: enp9s0---I: src=aa9d0e5 dst=aa9d287
      <idle>-0       [088] d.s. 1810775.267228: bpf_trace_printk: enp9s0---I: post_nat=0:0
      <idle>-0       [088] d.s. 1810775.267228: bpf_trace_printk: enp9s0---I: tun_ip=0
      <idle>-0       [088] d.s. 1810775.267229: bpf_trace_printk: enp9s0---I: pol_rc=1
      <idle>-0       [088] d.s. 1810775.267230: bpf_trace_printk: enp9s0---I: sport=0
      <idle>-0       [088] d.s. 1810775.267230: bpf_trace_printk: enp9s0---I: flags=20
      <idle>-0       [088] d.s. 1810775.267231: bpf_trace_printk: enp9s0---I: ct_rc=3
      <idle>-0       [088] d.s. 1810775.267231: bpf_trace_printk: enp9s0---I: ct_related=0
      <idle>-0       [088] d.s. 1810775.267232: bpf_trace_printk: enp9s0---I: mark=0x1000000
      <idle>-0       [088] d.s. 1810775.267233: bpf_trace_printk: enp9s0---I: ip->ttl 64
      <idle>-0       [088] d.s. 1810775.267234: bpf_trace_printk: enp9s0---I: marking enp9_SKB_MARK_BYPASS
      <idle>-0       [088] d.s. 1810775.267235: bpf_trace_printk: enp9s0---I: IP id=13695 s=aa9d0e5 d=aa9d287
      <idle>-0       [088] d.s. 1810775.267235: bpf_trace_printk: enp9s0---I: FIB family=2
      <idle>-0       [088] d.s. 1810775.267236: bpf_trace_printk: enp9s0---I: FIB tot_len=0
      <idle>-0       [088] d.s. 1810775.267237: bpf_trace_printk: enp9s0---I: FIB ifindex=2
      <idle>-0       [088] d.s. 1810775.267237: bpf_trace_printk: enp9s0---I: FIB l4_protocol=1
      <idle>-0       [088] d.s. 1810775.267238: bpf_trace_printk: enp9s0---I: FIB sport=0
      <idle>-0       [088] d.s. 1810775.267238: bpf_trace_printk: enp9s0---I: FIB dport=0
      <idle>-0       [088] d.s. 1810775.267239: bpf_trace_printk: enp9s0---I: FIB ipv4_src=aa9d0e5
      <idle>-0       [088] d.s. 1810775.267240: bpf_trace_printk: enp9s0---I: FIB ipv4_dst=aa9d287
      <idle>-0       [088] d.s. 1810775.267240: bpf_trace_printk: enp9s0---I: Traffic is towards the host namespace, doing Linux FIB lookup
      <idle>-0       [088] d.s. 1810775.267243: bpf_trace_printk: enp9s0---I: FIB lookup succeeded - with neigh
      <idle>-0       [088] d.s. 1810775.267244: bpf_trace_printk: enp9s0---I: Got Linux FIB hit, redirecting to iface 2.
      <idle>-0       [088] d.s. 1810775.267245: bpf_trace_printk: enp9s0---I: Traffic is towards host namespace, marking with 0x3000000.
      <idle>-0       [088] d.s. 1810775.267247: bpf_trace_printk: enp9s0---I: Final result=ALLOW (0). Program execution time: 31307ns
      <idle>-0       [088] d.s. 1810775.267249: bpf_trace_printk: enp9s0---E: New packet at ifindex=2; mark=3000000
      <idle>-0       [088] d.s. 1810775.267250: bpf_trace_printk: enp9s0---E: Final result=ALLOW (3). Bypass mark bit set.

For other systems (x86 currently), the log showed:

      <idle>-0       [014] ..s. 17619198.981271: 0: eno1np0--I: New packet at ifindex=2; mark=0
      <idle>-0       [014] ..s. 17619198.981271: 0: eno1np0--I: No metadata is shared by XDP
      <idle>-0       [014] ..s. 17619198.981272: 0: eno1np0--I: IP id=53367 s=aa9d0e5 d=aa9d27f
      <idle>-0       [014] ..s. 17619198.981273: 0: eno1np0--I: ICMP; type=8 code=0
      <idle>-0       [014] ..s. 17619198.981273: 0: eno1np0--I: CT-1 lookup from aa9d0e5:0
      <idle>-0       [014] ..s. 17619198.981274: 0: eno1np0--I: CT-1 lookup to   aa9d27f:0
      <idle>-0       [014] ..s. 17619198.981275: 0: eno1np0--I: CT-1 Hit! NORMAL entry.
      <idle>-0       [014] ..s. 17619198.981275: 0: eno1np0--I: CT-1 result: 0x2
      <idle>-0       [014] ..s. 17619198.981276: 0: eno1np0--I: conntrack entry flags 0x100
      <idle>-0       [014] ..s. 17619198.981276: 0: eno1np0--I: CT Hit
      <idle>-0       [014] ..s. 17619198.981277: 0: eno1np0--I: Entering calico_tc_skb_accepted_entrypoint
      <idle>-0       [014] ..s. 17619198.981277: 0: eno1np0--I: IP id=53367 s=aa9d0e5 d=aa9d27f
      <idle>-0       [014] ..s. 17619198.981278: 0: eno1np0--I: Entering calico_tc_skb_accepted
      <idle>-0       [014] ..s. 17619198.981278: 0: eno1np0--I: src=aa9d0e5 dst=aa9d27f
      <idle>-0       [014] ..s. 17619198.981279: 0: eno1np0--I: post_nat=0:0
      <idle>-0       [014] ..s. 17619198.981279: 0: eno1np0--I: tun_ip=0
      <idle>-0       [014] ..s. 17619198.981279: 0: eno1np0--I: pol_rc=1
      <idle>-0       [014] ..s. 17619198.981280: 0: eno1np0--I: sport=0
      <idle>-0       [014] ..s. 17619198.981280: 0: eno1np0--I: flags=20
      <idle>-0       [014] ..s. 17619198.981280: 0: eno1np0--I: ct_rc=2
      <idle>-0       [014] ..s. 17619198.981281: 0: eno1np0--I: ct_related=0
      <idle>-0       [014] ..s. 17619198.981281: 0: eno1np0--I: mark=0x1000000
      <idle>-0       [014] ..s. 17619198.981281: 0: eno1np0--I: ip->ttl 64
      <idle>-0       [014] ..s. 17619198.981282: 0: eno1np0--I: IP id=53367 s=aa9d0e5 d=aa9d27f
      <idle>-0       [014] ..s. 17619198.981283: 0: eno1np0--I: FIB family=2
      <idle>-0       [014] ..s. 17619198.981283: 0: eno1np0--I: FIB tot_len=0
      <idle>-0       [014] ..s. 17619198.981283: 0: eno1np0--I: FIB ifindex=2
      <idle>-0       [014] ..s. 17619198.981283: 0: eno1np0--I: FIB l4_protocol=1
      <idle>-0       [014] ..s. 17619198.981284: 0: eno1np0--I: FIB sport=0
      <idle>-0       [014] ..s. 17619198.981284: 0: eno1np0--I: FIB dport=0
      <idle>-0       [014] ..s. 17619198.981284: 0: eno1np0--I: FIB ipv4_src=aa9d0e5
      <idle>-0       [014] ..s. 17619198.981284: 0: eno1np0--I: FIB ipv4_dst=aa9d27f
      <idle>-0       [014] ..s. 17619198.981285: 0: eno1np0--I: Traffic is towards the host namespace, doing Linux FIB lookup
      <idle>-0       [014] ..s. 17619198.981287: 0: eno1np0--I: FIB lookup failed (FIB problem): 7.
      <idle>-0       [014] ..s. 17619198.981287: 0: eno1np0--I: Traffic is towards host namespace, marking with 0x1000000.
      <idle>-0       [014] ..s. 17619198.981288: 0: eno1np0--I: Final result=ALLOW (0). Program execution time: 16040ns
       vhost-3084463-3084499 [008] .... 17619198.981418: 0: eno1np0--E: New packet at ifindex=2; mark=0
       vhost-3084463-3084499 [008] .... 17619198.981419: 0: eno1np0--E: IP id=42046 s=aa9d27f d=aa9d0e5
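The s= and d= fields in both traces are IPv4 addresses printed as hexadecimal numbers without leading zeros. A minimal sketch for turning them back into dotted-quad form, assuming the printed value is the 32-bit address with the most significant octet first (the helper name hex_to_ipv4 is made up for illustration):

```shell
#!/usr/bin/env bash
# Convert a hex-printed IPv4 address (e.g. "aa9d0e5" from "s=aa9d0e5")
# into dotted-quad notation. Assumes most-significant-octet-first order.
hex_to_ipv4() {
  local value=$(( 16#$1 ))
  printf '%d.%d.%d.%d\n' \
    $(( (value >> 24) & 0xff )) \
    $(( (value >> 16) & 0xff )) \
    $(( (value >> 8) & 0xff )) \
    $(( value & 0xff ))
}

hex_to_ipv4 aa9d0e5   # source address in both traces
hex_to_ipv4 aa9d287   # destination in the arm64 trace
hex_to_ipv4 aa9d27f   # destination in the x86 trace
```

Under that assumption the three values decode to 10.169.208.229, 10.169.210.135 and 10.169.210.127, all in the same 10.169.x.x range as the hosts mentioned in the Context section.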

The test process is the same on both systems: we simply ping a VM on a host that has the Calico eBPF dataplane enabled from another host. On the arm64 platform, the ping packet can't reach the VM because it is wrongly forwarded by the eBPF program (the forward_or_drop function). The difference lies in the result of the FIB lookup: on x86, the FIB lookup fails with code 7 and the packet is marked with 0x1000000; on arm64, the FIB lookup succeeds with a neighbour entry, the packet is marked with 0x3000000, and it reappears in the egress direction of the same interface.
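To make the "code 7" in the x86 trace easier to read, here is a small lookup sketch. It assumes the number in "FIB lookup failed (FIB problem): 7" is the raw return value of the kernel's bpf_fib_lookup() helper; the names are the BPF_FIB_LKUP_RET_* values from linux/bpf.h, and the function name fib_rc_name is only illustrative:

```shell
#!/usr/bin/env bash
# Map a bpf_fib_lookup() return code to its BPF_FIB_LKUP_RET_* name.
fib_rc_name() {
  case "$1" in
    0) echo "BPF_FIB_LKUP_RET_SUCCESS" ;;
    1) echo "BPF_FIB_LKUP_RET_BLACKHOLE" ;;
    2) echo "BPF_FIB_LKUP_RET_UNREACHABLE" ;;
    3) echo "BPF_FIB_LKUP_RET_PROHIBIT" ;;
    4) echo "BPF_FIB_LKUP_RET_NOT_FWDED" ;;
    5) echo "BPF_FIB_LKUP_RET_FWD_DISABLED" ;;
    6) echo "BPF_FIB_LKUP_RET_UNSUPP_LWT" ;;
    7) echo "BPF_FIB_LKUP_RET_NO_NEIGH" ;;
    8) echo "BPF_FIB_LKUP_RET_FRAG_NEEDED" ;;
    *) echo "unknown ($1)" ;;
  esac
}

fib_rc_name 7   # the code seen in the x86 trace
fib_rc_name 0   # a successful lookup, as in the arm64 trace
```

If that assumption holds, the x86 host found a route but had no neighbour entry for the destination, while the arm64 host had one, which would explain why only the arm64 packet was redirected.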

I think that a packet destined for a VM rather than for the host should first be checked against the eBPF route map to see whether it is actually for the host. If the route lookup result is unknown, the packet should be treated as NOT destined for this host, and it is fine to return TC_ACT_OK and skip the subsequent eBPF processing.

I saw similar handling of unrelated traffic in the Cilium eBPF implementation: ep = lookup_ip4_endpoint(ip4); https://github.com/cilium/cilium/blob/master/bpf/bpf_host.c#L571

and if (!from_host) return CTX_ACT_OK; https://github.com/cilium/cilium/blob/master/bpf/bpf_host.c#L586

Here the endpoint of Cilium eBPF is similar to the route of Calico eBPF.

I will put up a PR to address this issue. Thanks in advance for your review.

Calico versions used: v3.23.2, v3.24.1 and v3.25.0-0.dev.

TrevorTaoARM avatar Sep 06 '22 03:09 TrevorTaoARM

@tomastigera @mazdakn could you guys please take a look?

lmm avatar Sep 06 '22 16:09 lmm

@TrevorTaoARM sorry for not responding sooner, totally missed this, :eyes: now! And thanks for a great analysis! :pray:

tomastigera avatar Oct 18 '22 16:10 tomastigera

@TrevorTaoARM I commented on your patch :arrow_up:

tomastigera avatar Oct 18 '22 17:10 tomastigera

The differences here lies on the result of FIB lookup, for x86 platform, the FIB lookup failed with code 7, then marked with 0x1000000; for arm64 platform, the FIB lookup succeeded with neigh given, then marked with 0x3000000 and re-appeared on the egress direction of the same interface.

It seems like the packets ultimately ended up on the egress of the same device regardless of whether the FIB failed or not. But I am not quite sure how the packet looks like in the ARM case as that is missing in the logs when the BYPASS mark is set. Perhaps the host mangled that packet?

tomastigera avatar Oct 20 '22 18:10 tomastigera

The differences here lies on the result of FIB lookup, for x86 platform, the FIB lookup failed with code 7, then marked with 0x1000000; for arm64 platform, the FIB lookup succeeded with neigh given, then marked with 0x3000000 and re-appeared on the egress direction of the same interface.

It seems like the packets ultimately ended up on the egress of the same device regardless of whether the FIB failed or not. But I am not quite sure how the packet looks like in the ARM case as that is missing in the logs when the BYPASS mark is set. Perhaps the host mangled that packet?

@tomastigera Yes, the difference in FIB lookup results between the 2 platforms really confused me. But it looks like the packet flow to a given VM is blocked only when eBPF is enabled. I don't know what the subsequent data path for the packet is once the BYPASS mark is set. The only trace I saw was:

      <idle>-0       [088] d.s. 1810775.267249: bpf_trace_printk: enp9s0---E: New packet at ifindex=2; mark=3000000
      <idle>-0       [088] d.s. 1810775.267250: bpf_trace_printk: enp9s0---E: Final result=ALLOW (3). Bypass mark bit set.

which showed that the packet had been transferred to the egress direction, whereas on x86 the packet is still in the ingress direction:

      <idle>-0       [014] ..s. 17619198.981287: 0: eno1np0--I: Traffic is towards host namespace, marking with 0x1000000.
      <idle>-0       [014] ..s. 17619198.981288: 0: eno1np0--I: Final result=ALLOW (0). Program execution time: 16040ns
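The two mark values in these traces can be told apart with a bit test. This is only a sketch based on what the logs show (0x3000000 leads to "Bypass mark bit set", 0x1000000 does not); the constant and function names below are illustrative and are not taken from the Calico source:

```shell
#!/usr/bin/env bash
# Mark values as they appear in the traces; names are hypothetical.
MARK_SEEN=$(( 0x1000000 ))
MARK_BYPASS=$(( 0x3000000 ))

# Classify a mark given as hex, with or without a leading "0x".
describe_mark() {
  local mark=$(( 16#${1#0x} ))
  if (( (mark & MARK_BYPASS) == MARK_BYPASS )); then
    echo "bypass"
  elif (( (mark & MARK_SEEN) == MARK_SEEN )); then
    echo "seen"
  else
    echo "unmarked"
  fi
}

describe_mark 3000000     # egress trace on arm64
describe_mark 0x1000000   # ingress trace on x86
describe_mark 0           # fresh packet
```

Read this way, the arm64 packet arrives at the egress hook already carrying the bypass bits, so the egress program allows it without logging the packet details, which is why the trace stops there.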

TrevorTaoARM avatar Oct 24 '22 06:10 TrevorTaoARM

@tomastigera Fixed, but not completely.
Version: v3.25.0-0.dev-490-g3b818a2f1494
Schema: eth0 (without IP) ---- bond0 (10.208.201.15/24) ---- app (port 2200)

      -0       [005] dNs3.   220.318140: bpf_trace_printk: eth0-----I: New packet at ifindex=2; mark=0
      -0       [005] dNs3.   220.318151: bpf_trace_printk: eth0-----I: No metadata is shared by XDP
      -0       [005] dNs3.   220.318152: bpf_trace_printk: eth0-----I: IP id=0 s=a97d428 d=ad0c90f
      -0       [005] dNs3.   220.318153: bpf_trace_printk: eth0-----I: TCP; ports: s=50634 d=2200
      -0       [005] dNs3.   220.318153: bpf_trace_printk: eth0-----I: CT-6 lookup from a97d428:50634
      -0       [005] dNs3.   220.318154: bpf_trace_printk: eth0-----I: CT-6 lookup to   ad0c90f:2200
      -0       [005] dNs3.   220.318155: bpf_trace_printk: eth0-----I: CT-6 Miss for TCP SYN, NEW flow.
      -0       [005] dNs3.   220.318156: bpf_trace_printk: eth0-----I: CT-6 result: NEW.
      -0       [005] dNs3.   220.318156: bpf_trace_printk: eth0-----I: conntrack entry flags 0x0
      -0       [005] dNs3.   220.318157: bpf_trace_printk: eth0-----I: NAT: 1st level lookup addr=ad0c90f port=2200 protocol=6.
      -0       [005] dNs3.   220.318158: bpf_trace_printk: eth0-----I: NAT: Miss.
      -0       [005] dNs3.   220.318160: bpf_trace_printk: eth0-----I: Host RPF check src=a97d428 skb iface=2 strict if 3
      -0       [005] dNs3.   220.318161: bpf_trace_printk: eth0-----I: Host RPF check src=a97d428 skb iface=2 fib rc 0
      -0       [005] dNs3.   220.318161: bpf_trace_printk: eth0-----I: Host RPF check src=a97d428 skb iface=2 result 0
      -0       [005] dNs3.   220.318162: bpf_trace_printk: eth0-----I: Final result=DENY (0). Program execution time: 10037ns

The packet above was dropped by the RPF check. With BPFEnforceRPF=Disabled:

      -0 [005] d.s3. 6710.121268: bpf_trace_printk: eth0-----I: TCP; ports: s=52905 d=2200
      -0 [005] d.s3. 6710.121269: bpf_trace_printk: eth0-----I: CT-6 lookup from a97d428:52905
      -0 [005] d.s3. 6710.121270: bpf_trace_printk: eth0-----I: CT-6 lookup to ad0c90f:2200
      -0 [005] d.s3. 6710.121271: bpf_trace_printk: eth0-----I: CT-6 Miss for TCP SYN, NEW flow.
      -0 [005] d.s3. 6710.121274: bpf_trace_printk: eth0-----I: CT-6 result: NEW.
      -0 [005] d.s3. 6710.121275: bpf_trace_printk: eth0-----I: conntrack entry flags 0x0
      -0 [005] d.s3. 6710.121277: bpf_trace_printk: eth0-----I: NAT: 1st level lookup addr=ad0c90f port=2200 protocol=6.
      -0 [005] d.s3. 6710.121280: bpf_trace_printk: eth0-----I: NAT: Miss.
      -0 [005] d.s3. 6710.121282: bpf_trace_printk: eth0-----I: Host RPF check disabled
      -0 [005] d.s3. 6710.121284: bpf_trace_printk: eth0-----I: Post-NAT dest IP is local host.
      -0 [005] d.s3. 6710.121285: bpf_trace_printk: eth0-----I: About to jump to policy program.
      -0 [005] d.s3. 6710.121285: bpf_trace_printk: eth0-----I: HEP with no policy, allow.
      -0 [005] d.s3. 6710.121287: bpf_trace_printk: eth0-----I: Entering calico_tc_skb_accepted_entrypoint
      -0 [005] d.s3. 6710.121288: bpf_trace_printk: eth0-----I: Entering calico_tc_skb_accepted
      -0 [005] d.s3. 6710.121289: bpf_trace_printk: eth0-----I: src=a97d428 dst=ad0c90f
      -0 [005] d.s3. 6710.121290: bpf_trace_printk: eth0-----I: post_nat=ad0c90f:2200
      -0 [005] d.s3. 6710.121291: bpf_trace_printk: eth0-----I: tun_ip=0
      -0 [005] d.s3. 6710.121297: bpf_trace_printk: eth0-----I: pol_rc=1
      -0 [005] d.s3. 6710.121298: bpf_trace_printk: eth0-----I: sport=52905
      -0 [005] d.s3. 6710.121299: bpf_trace_printk: eth0-----I: flags=24
      -0 [005] d.s3. 6710.121300: bpf_trace_printk: eth0-----I: ct_rc=0
      -0 [005] d.s3. 6710.121301: bpf_trace_printk: eth0-----I: ct_related=0
      -0 [005] d.s3. 6710.121302: bpf_trace_printk: eth0-----I: mark=0x1000000
      -0 [005] d.s3. 6710.121304: bpf_trace_printk: eth0-----I: ip->ttl 57
      -0 [005] d.s3. 6710.121307: bpf_trace_printk: eth0-----I: Allowed by policy: ACCEPT

Dimonyga avatar Nov 23 '22 09:11 Dimonyga

@Dimonyga Not sure whether this is related to the original issue. However, if you apply BPF programs to eth0 in this setup, then you certainly cannot pass a strict RPF check, because routing says that the return path is via bond0 and not eth0. So the bpfDataIfacePattern must not include eth0 and must include bond0. Note that this is also much more logically correct. However, there is an issue: if you change the pattern, the programs on eth0 are not cleared. You can either remove them manually or reboot the nodes. This issue is addressed by https://github.com/projectcalico/calico/pull/7008

tomastigera avatar Nov 23 '22 20:11 tomastigera

Sorry, my mistake. The setup is a little different: eth0 (no IP) ---- bond0 (SUBNET1) ---- bond0.208@bond0 (SUBNET2) ---- application (port 2200)

When we start calico-node with bpfdataifacepattern:^(bond.*|tunl0$|wireguard.cali$|vxlan.calico$), access to SUBNET2 is denied. When I pass the bpfenforcerpf:Disabled parameter, access is restored. In this case, the debug output shows that all packets that should be delivered to bond0.208 are dropped at the bond0 level. Suggestion: skip packets with vlanid != 0.
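A quick way to see which interfaces the pattern above selects is to test it against the interface names from this setup. This is a rough sketch: it uses bash's ERE matching, which is assumed to behave like Felix's regular expression engine for a simple pattern such as this one, and the helper name matches_pattern is made up:

```shell
#!/usr/bin/env bash
# The bpfDataIfacePattern value quoted in the comment above.
pattern='^(bond.*|tunl0$|wireguard.cali$|vxlan.calico$)'

# Return success if the interface name matches the pattern.
matches_pattern() {
  [[ $1 =~ $pattern ]]
}

for iface in eth0 bond0 bond0.208 tunl0 vxlan.calico; do
  if matches_pattern "$iface"; then
    echo "$iface: matched"
  else
    echo "$iface: not matched"
  fi
done
```

With this pattern eth0 is skipped, while both bond0 and bond0.208 match because of the bond.* alternative, so programs would be attached to the bond and to its VLAN sub-interface.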

Dimonyga avatar Dec 06 '22 13:12 Dimonyga