fix nat table by getting the fitting device for an address

Open DaanHoogland opened this issue 1 year ago • 12 comments

Description

This PR...

Fixes: #9473

Types of changes

  • [ ] Breaking change (fix or feature that would cause existing functionality to change)
  • [ ] New feature (non-breaking change which adds functionality)
  • [x] Bug fix (non-breaking change which fixes an issue)
  • [ ] Enhancement (improves an existing feature and functionality)
  • [ ] Cleanup (Code refactoring and cleanup, that may add test cases)
  • [ ] build/CI
  • [ ] test (unit or integration test code)

Feature/Enhancement Scale or Bug Severity

Feature/Enhancement Scale

  • [ ] Major
  • [x] Minor

Bug Severity

  • [ ] BLOCKER
  • [ ] Critical
  • [ ] Major
  • [ ] Minor
  • [ ] Trivial

Screenshots (if appropriate):

How Has This Been Tested?

How did you try to break this feature and the system with this change?

DaanHoogland avatar Aug 20 '24 08:08 DaanHoogland

Codecov Report

:white_check_mark: All modified and coverable lines are covered by tests. :white_check_mark: Project coverage is 15.08%. Comparing base (6e6a276) to head (20c4e4b). :warning: Report is 326 commits behind head on 4.19.

Additional details and impacted files
@@             Coverage Diff              @@
##               4.19    #9552      +/-   ##
============================================
- Coverage     15.08%   15.08%   -0.01%     
- Complexity    11184    11185       +1     
============================================
  Files          5406     5406              
  Lines        472889   472915      +26     
  Branches      57738    57661      -77     
============================================
+ Hits          71352    71354       +2     
- Misses       393593   393617      +24     
  Partials       7944     7944              
Flag Coverage Δ
uitests 4.30% <ø> (ø)
unittests 15.80% <ø> (-0.01%) :arrow_down:

codecov[bot] avatar Aug 20 '24 08:08 codecov[bot]

@DaanHoogland I had a look at issue https://github.com/apache/cloudstack/issues/8562 which has been fixed by #8599

Assume there are two public IPs in the VPC VR (and isolated network VR):

  • xx.xx.64.x (source nat, default public IP), on eth1
  • xx.xx.96.x (additional public IP range), on ethX

I think the expected behaviour should be:

  • all VMs (without Static NAT) have the source IP xx.xx.64.x (this is the current behaviour)
  • the VR should be able to connect to the xx.xx.96.x network with source IP xx.xx.96.x (otherwise the gateway check may fail, see #9473)
  • the VMs should be able to connect to the xx.xx.96.x network with source IP xx.xx.64.x or xx.xx.96.x (to be discussed)

Currently the rules are:

-A POSTROUTING -j SNAT -o eth1 --to-source xx.xx.64.x
-A POSTROUTING -j SNAT -o ethX --to-source xx.xx.64.x

It seems better to change them to:

-A POSTROUTING -j SNAT -o eth1 --to-source xx.xx.64.x
-A POSTROUTING -j SNAT -o ethX --to-source xx.xx.96.x

or

-A POSTROUTING -j SNAT -o eth1 --to-source xx.xx.64.x
-A POSTROUTING -j SNAT -o ethX -d xx.xx.96.1 --to-source xx.xx.96.x  (96.1 is gateway)
-A POSTROUTING -j SNAT -o ethX ! -d xx.xx.96.1 --to-source xx.xx.64.x  (96.1 is gateway)

to be discussed
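
The first variant above (each device uses a source address from its own range) depends on finding the device that fits an address. A minimal sketch of that lookup in Python, not the PR's actual code: the device table, the placeholder subnets (RFC 5737 documentation ranges standing in for the masked xx.xx.64.x and xx.xx.96.x) and the function names are all assumptions made for illustration.

```python
import ipaddress

# Hypothetical device table: device name -> subnet configured on it.
# The subnets are placeholders; the thread masks the real addresses.
DEVICES = {
    "eth1": ipaddress.ip_network("192.0.2.0/24"),
    "eth2": ipaddress.ip_network("198.51.100.0/24"),
}


def device_for_address(address, devices=DEVICES):
    """Return the name of the device whose subnet contains `address`, or None."""
    ip = ipaddress.ip_address(address)
    for name, network in devices.items():
        if ip in network:
            return name
    return None


def snat_rule(address, devices=DEVICES):
    """Build a POSTROUTING SNAT rule bound to the device that fits `address`."""
    device = device_for_address(address, devices)
    if device is None:
        raise ValueError(f"no device carries a subnet containing {address}")
    return f"-A POSTROUTING -o {device} -j SNAT --to-source {address}"
```

For example, `snat_rule("198.51.100.7")` would yield a rule on eth2 rather than on the default public device, which is the behaviour the first variant asks for.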

weizhouapache avatar Aug 20 '24 09:08 weizhouapache

minor issue, moving forward

DaanHoogland avatar Feb 03 '25 14:02 DaanHoogland

Packaging result [SF]: ✔️ el8 ✔️ el9 ✔️ debian ✔️ suse15. SL-JID 12440

blueorangutan avatar Feb 13 '25 12:02 blueorangutan

@DaanHoogland any update on this one?

Pearl1594 avatar Feb 28 '25 17:02 Pearl1594

@Pearl1594, no priority and no definite proof that it works yet. I will revisit later. cc @weizhouapache

DaanHoogland avatar Mar 02 '25 10:03 DaanHoogland

Tested this PR with a VPC:

  • source nat: 10.0.53.6
  • other public ips on same subnet: 10.0.53.20, 10.0.53.3
  • public ips on different subnet: 10.0.64.110/111/112

Main differences in the iptables rules:

 -A POSTROUTING -o eth1 -j SNAT --to-source 10.0.53.6       < ========= same (with Source NAT IP)

--A POSTROUTING -o eth2 -j SNAT --to-source 10.0.53.6       < ========== removed

+-A POSTROUTING -o eth1 -j SNAT --to-source 10.0.53.20    <========= new rules (with other public IPs)
+-A POSTROUTING -o eth1 -j SNAT --to-source 10.0.53.3
+-A POSTROUTING -o eth2 -j SNAT --to-source 10.0.64.111
+-A POSTROUTING -o eth2 -j SNAT --to-source 10.0.64.110
+-A POSTROUTING -o eth2 -j SNAT --to-source 10.0.64.112

The issue #8562, fixed by #8599, will come back. They are two different cases, and it looks difficult to make both work ...
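
One way to see why the two cases are hard to reconcile: SNAT is a terminating target, so for a given packet the first matching rule in the chain decides the source address. A small Python simulation of that first-match behaviour follows. It assumes the rules are evaluated in the order listed above and match only on the output device; the real chain order on the router is not shown in this thread, and the helper name is hypothetical.

```python
def first_matching_source(rules, out_device):
    """Return the --to-source of the first rule whose -o matches, or None."""
    for device, source in rules:
        if device == out_device:
            return source
    return None


# (output device, source address) pairs in the order listed in the test
# above; the actual order in the router's chain is an assumption.
RULES = [
    ("eth1", "10.0.53.6"),
    ("eth1", "10.0.53.20"),
    ("eth1", "10.0.53.3"),
    ("eth2", "10.0.64.111"),
    ("eth2", "10.0.64.110"),
    ("eth2", "10.0.64.112"),
]

print(first_matching_source(RULES, "eth1"))  # 10.0.53.6
print(first_matching_source(RULES, "eth2"))  # 10.0.64.111
```

Under that assumption, only one source address per device is ever used unless the rules carry extra match criteria such as `-s` or `-d`, so traffic leaving eth2 can no longer use the Source NAT IP.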

weizhouapache avatar Apr 17 '25 13:04 weizhouapache

@weizhouapache, it sounds like it is impossible to do both (putting SNAT for the secondary IP on its own interface and on the primary interface).

So how about making the health check script accept this situation somehow?

DaanHoogland avatar Apr 17 '25 14:04 DaanHoogland

After several discussions with @DaanHoogland and @sureshanaparti, it seems difficult to fix the issue with iptables rules.

We could change the severity level to warning when #10710 is merged. Moving to the 4.22 milestone.

weizhouapache avatar Sep 08 '25 07:09 weizhouapache