Pod in nat-outgoing should not be SNATed when it accesses local cluster

Open wayne-cheng opened this issue 1 year ago • 6 comments

Description

When natOutgoing is enabled in an IPPool, packets sent from Calico-networked containers in this IP pool to destinations outside the pool will be SNATed (masqueraded).

However, we would prefer that traffic to local cluster hosts not be masqueraded.

Provide an enum setting, natOutgoingExclusions (default: IPPoolsOnly), in the Felix config.

It configures which types of destinations are excluded from being masqueraded.

  • IPPoolsOnly: destinations outside of this IP pool will be masqueraded.
  • IPPoolsAndHostIPs: destinations that are outside of this IP pool and are not cluster host IPs will be masqueraded.

To maintain compatibility, the new behavior is not enabled by default. We can consider switching the default value to IPPoolsAndHostIPs in a future version.
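
The two modes can be sketched as a small decision function. This is only a Python illustration of the rule described above, not Felix's actual implementation: `should_masquerade`, the example CIDR, and the host addresses are hypothetical, and it assumes that "inside an IP pool" means inside any configured pool.

```python
import ipaddress


def should_masquerade(dst, pool_cidrs, host_ips, mode="IPPoolsOnly"):
    """Hypothetical helper: would traffic from a natOutgoing pool to dst be SNATed?"""
    addr = ipaddress.ip_address(dst)

    # Destinations inside a configured IP pool are never masqueraded.
    if any(addr in ipaddress.ip_network(cidr) for cidr in pool_cidrs):
        return False

    # With IPPoolsAndHostIPs, cluster host addresses are excluded as well.
    if mode == "IPPoolsAndHostIPs":
        if addr in {ipaddress.ip_address(h) for h in host_ips}:
            return False

    # Everything else is treated as external and gets masqueraded.
    return True


# Example values, made up for illustration.
POOLS = ["10.244.0.0/16"]
HOSTS = ["192.168.1.10", "192.168.1.11"]

for mode in ("IPPoolsOnly", "IPPoolsAndHostIPs"):
    for dst in ("10.244.3.7", "192.168.1.11", "8.8.8.8"):
        print(mode, dst, should_masquerade(dst, POOLS, HOSTS, mode))
```

Only the host destination changes between the two modes: pod-to-pod traffic is never masqueraded, and traffic to addresses outside the cluster always is.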

Related issues/PRs

Fixes the Linux part of #8960

Todos

  • [ ] Tests
  • [ ] Documentation
  • [ ] Release note

Release Note

New Felix config parameter natOutgoingExclusions configures which types of destinations are excluded from being masqueraded.

Reminder for the reviewer

Make sure that this PR has the correct labels and milestone set.

Every PR needs one docs-* label.

  • docs-pr-required: This change requires a change to the documentation that has not been completed yet.
  • docs-completed: This change has all necessary documentation completed.
  • docs-not-required: This change has no user-facing impact and requires no docs.

Every PR needs one release-note-* label.

  • release-note-required: This PR has user-facing changes. Most PRs should have this label.
  • release-note-not-required: This PR has no user-facing changes.

Other optional labels:

  • cherry-pick-candidate: This PR should be cherry-picked to an earlier release. For bug fixes only.
  • needs-operator-pr: This PR is related to install and requires a corresponding change to the operator.

wayne-cheng avatar Jun 29 '24 10:06 wayne-cheng

/sem-approve

coutinhop avatar Jul 04 '24 16:07 coutinhop

I think this is achievable already today by creating a disabled IP pool containing your host IP range which tells Calico these IPs aren't external to the cluster.

This is obviously not ideal for clusters where the host IPs might not fall cleanly into a small set of IP pools.
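
A rough sketch of why the disabled-pool workaround works, and where it falls short. It is written in Python for illustration only: the pool records, field names, and addresses are hypothetical, and the model simply assumes that an address covered by any IP pool, enabled or disabled, is not treated as external.

```python
import ipaddress

# Hypothetical pool records; only the fields needed for the illustration.
POOLS = [
    {"cidr": "10.244.0.0/16", "disabled": False, "nat_outgoing": True},
    # Workaround: a disabled pool that merely covers the host range.
    {"cidr": "192.168.1.0/24", "disabled": True, "nat_outgoing": False},
]


def is_cluster_internal(dst, pools):
    """True if dst falls inside any pool; the 'disabled' flag is deliberately ignored."""
    addr = ipaddress.ip_address(dst)
    return any(addr in ipaddress.ip_network(p["cidr"]) for p in pools)


# A host inside the covered range is no longer seen as external...
print(is_cluster_internal("192.168.1.11", POOLS))
# ...but a host outside that range still is, which is the drawback noted above.
print(is_cluster_internal("172.16.5.9", POOLS))
```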

I do think a setting like this makes a bit more sense on FelixConfiguration than on IP pools - setting this for a single IP pool (at least based on this implementation) is going to enforce the setting on every IP pool with NATOutgoing set, so the config model is a little bit misleading.

@fasaxc WDYT?

caseydavenport avatar Jul 08 '24 22:07 caseydavenport

This is likely going to work for eBPF mode as well, however, it would be better to propagate the setting to somewhere here

tomastigera avatar Oct 08 '24 22:10 tomastigera

> This is likely going to work for eBPF mode as well, however, it would be better to propagate the setting to somewhere here

In the related issue, I mentioned that Calico for Windows also needs such a setting. I think we can continue to improve it through additional PRs.

wayne-cheng avatar Oct 09 '24 02:10 wayne-cheng

Could we make this an enum; say:

NATOutgoingExclusions: IPPoolsOnly | IPPoolsAndHostSubnet 

then we can improve over time with new, better options. For example, we could match on the all-hosts IP set to offer IPPoolsAndHostIPs.

fasaxc avatar Oct 09 '24 08:10 fasaxc

Hi, I updated the code and description. PTAL, thanks! @fasaxc

wayne-cheng avatar Oct 17 '24 12:10 wayne-cheng

@wayne-cheng sorry for completely dropping the ball on this PR, I honestly thought that we had merged it. I rebased your code here https://github.com/projectcalico/calico/pull/10275 but you are most welcome to rebase your PR if you want to drive it to the finish line (all the YAML conflicts get resolved by running make generate after the API conflicts are resolved).

Any more comments @fasaxc ?

tomastigera avatar Apr 22 '25 21:04 tomastigera

OK, please retest. @tomastigera

wayne-cheng avatar Apr 23 '25 03:04 wayne-cheng

/sem-approve

tomastigera avatar Apr 23 '25 16:04 tomastigera

could you git rm manifests/ocp/crd.projectcalico.org_felixconfigurations.yaml we no longer have that file

tomastigera avatar Apr 23 '25 16:04 tomastigera

> could you git rm manifests/ocp/crd.projectcalico.org_felixconfigurations.yaml we no longer have that file

@tomastigera Okay, it’s likely that the command make generate was executed when the code was incorrect. I’ve removed it now.

wayne-cheng avatar Apr 23 '25 23:04 wayne-cheng

/sem-approve

tomastigera avatar Apr 23 '25 23:04 tomastigera

One silly issue, you need to change this value to 162 🤦‍♂️ https://github.com/wayne-cheng/calico/blob/fix-iptables-nat-host/libcalico-go/lib/backend/syncersv1/updateprocessors/configurationprocessor_test.go#L46

tomastigera avatar Apr 24 '25 17:04 tomastigera

> One silly issue, you need to change this value to 162 🤦‍♂️ https://github.com/wayne-cheng/calico/blob/fix-iptables-nat-host/libcalico-go/lib/backend/syncersv1/updateprocessors/configurationprocessor_test.go#L46

@tomastigera OK, please re-run the CI.

wayne-cheng avatar Apr 24 '25 22:04 wayne-cheng

/sem-approve

tomastigera avatar Apr 25 '25 17:04 tomastigera

@wayne-cheng Thanks for the contribution!

tomastigera avatar May 02 '25 18:05 tomastigera