amazon-vpc-cni-k8s
                        Add missing rules when NodePort support is disabled
What type of PR is this?
bug
Which issue does this PR fix:
#2025
What does this PR do / Why do we need it:
- The rules that need to be installed for NodePort support and for SNAT support are very similar; the same traffic mark is needed for both. As a result, rules that are currently installed only when NodePort support is enabled should also be installed when external SNAT is disabled, which is the default (see the sketch after this list).
- Remove "-m state --state NEW" from a rule in the nat table; this matcher is always true for packets that traverse the nat table.
- Fix a typo in one rule's name (extra whitespace).
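To illustrate the first change above, here is a minimal sketch, using the go-iptables library, of how the rule-installation condition can account for both features. This is not the actual networking.go code: the flag names are hypothetical, and the real plugin derives its configuration from env vars such as AWS_VPC_CNI_NODE_PORT_SUPPORT and AWS_VPC_K8S_CNI_EXTERNALSNAT.

package main

import (
	"log"

	"github.com/coreos/go-iptables/iptables"
)

// Hypothetical flags standing in for the daemon's configuration.
var (
	nodePortSupportEnabled = false
	externalSNATEnabled    = false
)

func main() {
	ipt, err := iptables.New()
	if err != nil {
		log.Fatalf("failed to initialize iptables: %v", err)
	}

	// The connmark rules are needed both for NodePort support and for SNAT
	// (i.e. when external SNAT is disabled, the default), so they must be
	// installed when either feature requires them.
	if nodePortSupportEnabled || !externalSNATEnabled {
		rule := []string{
			"-m", "comment", "--comment", "AWS, CONNMARK",
			"-j", "CONNMARK", "--restore-mark", "--nfmask", "0x80", "--ctmask", "0x80",
		}
		// AppendUnique avoids installing the same rule twice across restarts.
		if err := ipt.AppendUnique("nat", "PREROUTING", rule...); err != nil {
			log.Fatalf("failed to install connmark restore rule: %v", err)
		}
	}
}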
Testing done on this change:
- Unit testing
- Tested manually by disabling NodePort support and adding the missing rule.
- Tested with custom container images for amazon-k8s-cni and amazon-k8s-cni-init
Automation added to e2e:
Will this PR introduce any new dependencies?: No
Will this break upgrades or downgrades? Has updating a running cluster been tested?: Should not break upgrades or downgrades (there may be some superfluous iptables rules). Not tested.
Does this change require updates to the CNI daemonset config files to work?: No
Does this PR introduce any user-facing change?: No, bug fix
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
Fixes #2025
Co-authored-by: Quan Tian [email protected]
Signed-off-by: Antonin Bas [email protected]
I am happy to do more testing once maintainers validate the approach.
@antoninbas, thanks for the fix. Could you test the following?
- upgrade from the previous v1.10 build to the one with your fix, and verify the nat PREROUTING rules are configured as expected
@kishorj I did 2 upgrade tests, one with NodePort support enabled, and one with NodePort support disabled. Full iptables rules for nat and mangle tables are below.
There is only one issue. Because I removed the unnecessary -m state --state NEW matcher from a PREROUTING rule, the rule is now duplicated, and will remain so until the next Node restart:
# old rule
-A PREROUTING -i eni+ -m comment --comment "AWS, outbound connections" -m state --state NEW -j AWS-CONNMARK-CHAIN-0
# new rule
-A PREROUTING -i eni+ -m comment --comment "AWS, outbound connections" -j AWS-CONNMARK-CHAIN-0
I see 3 solutions; let me know which one you want:
- add some logic to remove the old rule (a sketch of this appears after the list)
- roll back this change in my patch and keep the unnecessary matcher; it won't impact the correctness of the patch, and the improved rule was just a "nice-to-have"
- keep the patch as it is, since the duplicated rule doesn't impact correctness. On new K8s clusters and after a Node restart, there will be no duplicates.
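For reference, a minimal sketch (using the go-iptables library) of what option 1 could look like; the helper name is hypothetical, and the rule spec is the pre-upgrade variant shown above.

package main

import (
	"log"

	"github.com/coreos/go-iptables/iptables"
)

// removeStaleConnmarkRule deletes the pre-upgrade variant of the nat
// PREROUTING rule (the one still carrying "-m state --state NEW") if it is
// present, so that the updated rule does not end up duplicated.
// Hypothetical helper, not the function name used in the PR.
func removeStaleConnmarkRule(ipt *iptables.IPTables) error {
	staleRule := []string{
		"-i", "eni+",
		"-m", "comment", "--comment", "AWS, outbound connections",
		"-m", "state", "--state", "NEW",
		"-j", "AWS-CONNMARK-CHAIN-0",
	}
	exists, err := ipt.Exists("nat", "PREROUTING", staleRule...)
	if err != nil {
		return err
	}
	if exists {
		log.Printf("host network setup: found potentially stale rule for chain PREROUTING: %v", staleRule)
		return ipt.Delete("nat", "PREROUTING", staleRule...)
	}
	return nil
}

func main() {
	ipt, err := iptables.New()
	if err != nil {
		log.Fatalf("failed to initialize iptables: %v", err)
	}
	if err := removeStaleConnmarkRule(ipt); err != nil {
		log.Fatalf("failed to remove stale rule: %v", err)
	}
}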
Full iptables rules
NodePort support enabled (default)
Default build (v1.10.1-eksbuild.1)
# nat
-A PREROUTING -m comment --comment "kubernetes service portals" -j KUBE-SERVICES
-A PREROUTING -i eni+ -m comment --comment "AWS, outbound connections" -m state --state NEW -j AWS-CONNMARK-CHAIN-0
-A PREROUTING -m comment --comment "AWS, CONNMARK" -j CONNMARK --restore-mark --nfmask 0x80 --ctmask 0x80
-A OUTPUT -m comment --comment "kubernetes service portals" -j KUBE-SERVICES
-A POSTROUTING -m comment --comment "kubernetes postrouting rules" -j KUBE-POSTROUTING
-A POSTROUTING -m comment --comment "AWS SNAT CHAIN" -j AWS-SNAT-CHAIN-0
-A AWS-CONNMARK-CHAIN-0 ! -d 192.168.0.0/16 -m comment --comment "AWS CONNMARK CHAIN, VPC CIDR" -j AWS-CONNMARK-CHAIN-1
-A AWS-CONNMARK-CHAIN-1 -m comment --comment "AWS, CONNMARK" -j CONNMARK --set-xmark 0x80/0x80
-A AWS-SNAT-CHAIN-0 ! -d 192.168.0.0/16 -m comment --comment "AWS SNAT CHAIN" -j AWS-SNAT-CHAIN-1
-A AWS-SNAT-CHAIN-1 ! -o vlan+ -m comment --comment "AWS, SNAT" -m addrtype ! --dst-type LOCAL -j SNAT --to-source 192.168.14.191 --random-fully
# mangle 
-A PREROUTING -i eth0 -m comment --comment "AWS, primary ENI" -m addrtype --dst-type LOCAL --limit-iface-in -j CONNMARK --set-xmark 0x80/0x80
-A PREROUTING -i eni+ -m comment --comment "AWS, primary ENI" -j CONNMARK --restore-mark --nfmask 0x80 --ctmask 0x80
-A PREROUTING -i vlan+ -m comment --comment "AWS, primary ENI" -j CONNMARK --restore-mark --nfmask 0x80 --ctmask 0x80
After upgrade to patched build
# nat
-A PREROUTING -m comment --comment "kubernetes service portals" -j KUBE-SERVICES
-A PREROUTING -i eni+ -m comment --comment "AWS, outbound connections" -m state --state NEW -j AWS-CONNMARK-CHAIN-0
-A PREROUTING -i eni+ -m comment --comment "AWS, outbound connections" -j AWS-CONNMARK-CHAIN-0
-A PREROUTING -m comment --comment "AWS, CONNMARK" -j CONNMARK --restore-mark --nfmask 0x80 --ctmask 0x80
-A OUTPUT -m comment --comment "kubernetes service portals" -j KUBE-SERVICES
-A POSTROUTING -m comment --comment "kubernetes postrouting rules" -j KUBE-POSTROUTING
-A POSTROUTING -m comment --comment "AWS SNAT CHAIN" -j AWS-SNAT-CHAIN-0
-A AWS-CONNMARK-CHAIN-0 ! -d 192.168.0.0/16 -m comment --comment "AWS CONNMARK CHAIN, VPC CIDR" -j AWS-CONNMARK-CHAIN-1
-A AWS-CONNMARK-CHAIN-1 -m comment --comment "AWS, CONNMARK" -j CONNMARK --set-xmark 0x80/0x80
-A AWS-SNAT-CHAIN-0 ! -d 192.168.0.0/16 -m comment --comment "AWS SNAT CHAIN" -j AWS-SNAT-CHAIN-1
-A AWS-SNAT-CHAIN-1 ! -o vlan+ -m comment --comment "AWS, SNAT" -m addrtype ! --dst-type LOCAL -j SNAT --to-source 192.168.14.191 --random-fully
# mangle
-A PREROUTING -i eth0 -m comment --comment "AWS, primary ENI" -m addrtype --dst-type LOCAL --limit-iface-in -j CONNMARK --set-xmark 0x80/0x80
-A PREROUTING -i eni+ -m comment --comment "AWS, primary ENI" -j CONNMARK --restore-mark --nfmask 0x80 --ctmask 0x80
-A PREROUTING -i vlan+ -m comment --comment "AWS, primary ENI" -j CONNMARK --restore-mark --nfmask 0x80 --ctmask 0x80
NodePort support disabled
Default build (v1.10.1-eksbuild.1)
# nat
-A PREROUTING -m comment --comment "kubernetes service portals" -j KUBE-SERVICES
-A PREROUTING -i eni+ -m comment --comment "AWS, outbound connections" -m state --state NEW -j AWS-CONNMARK-CHAIN-0
-A PREROUTING -m comment --comment "AWS, CONNMARK" -j CONNMARK --restore-mark --nfmask 0x80 --ctmask 0x80
-A OUTPUT -m comment --comment "kubernetes service portals" -j KUBE-SERVICES
-A POSTROUTING -m comment --comment "kubernetes postrouting rules" -j KUBE-POSTROUTING
-A POSTROUTING -m comment --comment "AWS SNAT CHAIN" -j AWS-SNAT-CHAIN-0
-A AWS-CONNMARK-CHAIN-0 ! -d 192.168.0.0/16 -m comment --comment "AWS CONNMARK CHAIN, VPC CIDR" -j AWS-CONNMARK-CHAIN-1
-A AWS-CONNMARK-CHAIN-1 -m comment --comment "AWS, CONNMARK" -j CONNMARK --set-xmark 0x80/0x80
-A AWS-SNAT-CHAIN-0 ! -d 192.168.0.0/16 -m comment --comment "AWS SNAT CHAIN" -j AWS-SNAT-CHAIN-1
-A AWS-SNAT-CHAIN-1 ! -o vlan+ -m comment --comment "AWS, SNAT" -m addrtype ! --dst-type LOCAL -j SNAT --to-source 192.168.14.191 --random-fully
# mangle
# empty
After upgrade to patched build
# nat
-A PREROUTING -m comment --comment "kubernetes service portals" -j KUBE-SERVICES
-A PREROUTING -i eni+ -m comment --comment "AWS, outbound connections" -m state --state NEW -j AWS-CONNMARK-CHAIN-0
-A PREROUTING -i eni+ -m comment --comment "AWS, outbound connections" -j AWS-CONNMARK-CHAIN-0
-A PREROUTING -m comment --comment "AWS, CONNMARK" -j CONNMARK --restore-mark --nfmask 0x80 --ctmask 0x80
-A OUTPUT -m comment --comment "kubernetes service portals" -j KUBE-SERVICES
-A POSTROUTING -m comment --comment "kubernetes postrouting rules" -j KUBE-POSTROUTING
-A POSTROUTING -m comment --comment "AWS SNAT CHAIN" -j AWS-SNAT-CHAIN-0
-A AWS-CONNMARK-CHAIN-0 ! -d 192.168.0.0/16 -m comment --comment "AWS CONNMARK CHAIN, VPC CIDR" -j AWS-CONNMARK-CHAIN-1
-A AWS-CONNMARK-CHAIN-1 -m comment --comment "AWS, CONNMARK" -j CONNMARK --set-xmark 0x80/0x80
-A AWS-SNAT-CHAIN-0 ! -d 192.168.0.0/16 -m comment --comment "AWS SNAT CHAIN" -j AWS-SNAT-CHAIN-1
-A AWS-SNAT-CHAIN-1 ! -o vlan+ -m comment --comment "AWS, SNAT" -m addrtype ! --dst-type LOCAL -j SNAT --to-source 192.168.14.191 --random-fully
# mangle
-A PREROUTING -i eni+ -m comment --comment "AWS, primary ENI" -j CONNMARK --restore-mark --nfmask 0x80 --ctmask 0x80
@kishorj friendly ping for this
3. keep the patch as it is, since the duplicated rule doesn't impact correctness. On new K8s clusters and after a Node restart, there will be no duplicates.
This option is acceptable to me
Option 1 is cleaner, as it removes the old rule.
networking.go:703 has the following log line: log.Debugf("host network setup: found potentially stale SNAT rule for chain %s: %v", chain, ruleSpec).
I would greatly appreciate it if you could modify it as follows:
log.Debugf("host network setup: found potentially stale rule for chain %s: %v", chain, ruleSpec)
@kishorj I will update the PR with all your feedback some time this week
@kishorj I updated this PR with the changes you requested (it's all in the most recent commit)
I also ran the requested tests on an EKS cluster.
- start with old DS
Here are the PREROUTING rules in the nat table:
-A PREROUTING -m comment --comment "kubernetes service portals" -j KUBE-SERVICES
-A PREROUTING -i eni+ -m comment --comment "AWS, outbound connections" -m state --state NEW -j AWS-CONNMARK-CHAIN-0
-A PREROUTING -m comment --comment "AWS, CONNMARK" -j CONNMARK --restore-mark --nfmask 0x80 --ctmask 0x80
- update to the new DS
Here are the PREROUTING rules in the nat table:
-A PREROUTING -m comment --comment "kubernetes service portals" -j KUBE-SERVICES
-A PREROUTING -i eni+ -m comment --comment "AWS, outbound connections" -j AWS-CONNMARK-CHAIN-0
-A PREROUTING -m comment --comment "AWS, CONNMARK" -j CONNMARK --restore-mark --nfmask 0x80 --ctmask 0x80
The rule has been updated correctly.
- restart the DS Pod and check the logs
$ kubectl -n kube-system logs pod/aws-node-b88wv -c aws-node
{"level":"info","ts":"2022-08-19T22:36:08.333Z","caller":"entrypoint.sh","msg":"Validating env variables ..."}
{"level":"info","ts":"2022-08-19T22:36:08.334Z","caller":"entrypoint.sh","msg":"Install CNI binaries.."}
{"level":"info","ts":"2022-08-19T22:36:08.350Z","caller":"entrypoint.sh","msg":"Starting IPAM daemon in the background ... "}
{"level":"info","ts":"2022-08-19T22:36:08.352Z","caller":"entrypoint.sh","msg":"Checking for IPAM connectivity ... "}
{"level":"info","ts":"2022-08-19T22:36:10.363Z","caller":"entrypoint.sh","msg":"Retrying waiting for IPAM-D"}
{"level":"info","ts":"2022-08-19T22:36:12.370Z","caller":"entrypoint.sh","msg":"Retrying waiting for IPAM-D"}
{"level":"info","ts":"2022-08-19T22:36:12.391Z","caller":"entrypoint.sh","msg":"Copying config file ... "}
{"level":"info","ts":"2022-08-19T22:36:12.395Z","caller":"entrypoint.sh","msg":"Successfully copied CNI plugin binary and config file."}
{"level":"info","ts":"2022-08-19T22:36:12.397Z","caller":"entrypoint.sh","msg":"Foregrounding IPAM daemon ..."}
$ kubectl -n kube-system logs pod/aws-node-b88wv -c aws-vpc-cni-init
Copying CNI plugin binaries ...
Configure rp_filter loose...
net.ipv4.conf.eth0.rp_filter = 2
2
net.ipv4.tcp_early_demux = 1
CNI init container done
/lgtm
@kishorj I see that the CI tests are not running. I think you may have to approve that on your side since I am a first time contributor to this repo.
@antoninbas - We have disabled the CI tests for now. I will discuss with @kishorj and see if we can run it manually before merge.
Ok. For what it's worth, I have run the unit tests locally already (with make docker-unit-tests).
Thank you :) But we will have to also run the integration tests. I will see if we can manually trigger it for now :)
Sorry for the delay, we had disabled the GitHub runners. Now it is up and running, I have submitted for integration tests - https://github.com/aws/amazon-vpc-cni-k8s/actions/runs/3139125467
/cc @kishorj
2 attempts of the upstream conformance tests are failing - https://github.com/aws/amazon-vpc-cni-k8s/actions/runs/3139125467. I can look into it next week.
I ran the conformance tests locally (after updating the branch), and can confirm that all the tests passed. I will re-run the tests in GitHub.
git log
commit 85e2ef5ac82dfd3f79f2a5b05ec52f485de5a0c0 (HEAD -> add-missing-rules-when-node-port-support-is-disabled)
Ran 19 of 5771 Specs in 1065.502 seconds
SUCCESS! -- 19 Passed | 0 Failed | 0 Flaked | 0 Pending | 5752 Skipped
PASS
Integration tests in progress https://github.com/aws/amazon-vpc-cni-k8s/actions/runs/3364855441/jobs/5579689661
The latest run of the integration tests has passed: https://github.com/aws/amazon-vpc-cni-k8s/actions/runs/3364855441/jobs/5582884522