python-iptables
Performance issue with iptc.easy.dump_all() on a server with many rules.
Hi,
Disclaimer:
First of all, I'm fairly new to Python and this is my first project with python-iptables, so I'm sorry if this is not the correct place or way to post this question; I didn't find another forum. I might also be missing something obvious, or a better way of achieving what I want. If so, please feel free to point it out.
Thank you in advance!
Issue:
I'm running some VPN servers with multiple (hundreds of) site-to-site VPN tunnels. Each tunnel has on average 5 to 10 iptables rules, and possibly more if it has more endpoints. Every time a tunnel is set up or shut down, a custom script runs to install or delete iptables rules, VTI interfaces and routes. Previously we used shell scripts, but we now want to move to a custom Python script.
I just deployed it to one of our servers and noticed that CPU usage shot up. When I run the script under cProfile, I see that iptc.easy.dump_all() accounts for a large share of the cumulative time (cumtime).
This happens when we try to get all the rules associated with a specific tunnel by dumping all rules and then picking out the ones with a specific comment.
Function from which iptc.easy.dump_all() is called and where the data is used:
    import re

    import iptc


    def get_installed_rules_from_comment(comment):
        # Dump every rule in every table, then keep the rules whose comment
        # starts with `comment` followed by a space or the end of the string.
        all_installed_rules = iptc.easy.dump_all()
        match_pattern = re.compile(re.escape(comment) + "( |$)")
        for table_name, table_chains in all_installed_rules.items():
            for chain_name, chain_rules in table_chains.items():
                for rule in chain_rules:
                    if "comment" in rule and match_pattern.match(
                        rule["comment"]["comment"].strip()
                    ):
                        yield (table_name, chain_name, rule)
Here is the cProfile output:
2797656 function calls (2773068 primitive calls) in 2.943 seconds
Ordered by: cumulative time
List reduced from 2485 to 20 due to restriction <20>
ncalls tottime percall cumtime percall filename:lineno(function)
289/1 0.002 0.000 2.943 2.943 {built-in method builtins.exec}
1 0.000 0.000 2.943 2.943 /scripts/ipsec-common/updowntunnel/updowntunnel_script.py:4(<module>)
1 0.000 0.000 2.663 2.663 /scripts/ipsec-common/updowntunnel/updowntunnel_script.py:97(tunnel_down)
2 0.002 0.001 2.438 1.219 /scripts/ipsec-common/updowntunnel/iptables/iptables.py:21(get_installed_rules_for_tunnel)
7 0.003 0.000 2.434 0.348 /scripts/ipsec-common/updowntunnel/iptables/iptables.py:8(get_installed_rules_from_comment)
2 0.000 0.000 2.422 1.211 /usr/local/lib/python3.7/dist-packages/iptc/easy.py:202(dump_all)
2 0.000 0.000 2.417 1.209 /usr/local/lib/python3.7/dist-packages/iptc/easy.py:204(<dictcomp>)
10 0.000 0.000 2.417 0.242 /usr/local/lib/python3.7/dist-packages/iptc/easy.py:206(dump_table)
10 0.000 0.000 2.415 0.241 /usr/local/lib/python3.7/dist-packages/iptc/easy.py:208(<dictcomp>)
38 0.000 0.000 2.414 0.064 /usr/local/lib/python3.7/dist-packages/iptc/easy.py:210(dump_chain)
38 0.002 0.000 1.373 0.036 /usr/local/lib/python3.7/dist-packages/iptc/easy.py:213(<listcomp>)
3165 0.042 0.000 1.371 0.000 /usr/local/lib/python3.7/dist-packages/iptc/easy.py:317(decode_iptc_rule)
1 0.000 0.000 1.194 1.194 /scripts/ipsec-common/updowntunnel/iptables/iptables.py:27(delete_all_tunnel_rules)
9434 0.062 0.000 1.106 0.000 /usr/local/lib/python3.7/dist-packages/iptc/ip4tc.py:410(get_all_parameters)
38 0.002 0.000 1.031 0.027 /usr/local/lib/python3.7/dist-packages/iptc/ip4tc.py:1498(_get_rules)
38 0.002 0.000 1.021 0.027 /usr/local/lib/python3.7/dist-packages/iptc/ip4tc.py:1504(<listcomp>)
3165 0.003 0.000 1.018 0.000 /usr/local/lib/python3.7/dist-packages/iptc/ip4tc.py:1819(create_rule)
3170 0.005 0.000 1.016 0.000 /usr/local/lib/python3.7/dist-packages/iptc/ip4tc.py:943(__init__)
3170 0.040 0.000 1.011 0.000 /usr/local/lib/python3.7/dist-packages/iptc/ip4tc.py:1327(_set_rule)
9252 0.035 0.000 0.760 0.000 /usr/lib/python3.7/shlex.py:304(split)
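For anyone who wants to reproduce a report in this shape, here is a minimal sketch using only the standard library. `profile_call` is a hypothetical helper name, not something from the script above; it runs one callable under cProfile and returns the top entries sorted by cumulative time.

```python
import cProfile
import io
import pstats


def profile_call(func, *args, limit=20, **kwargs):
    """Run func under cProfile; return (result, report sorted by cumulative time)."""
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args, **kwargs)

    # Write the report into a string instead of stdout so it can be logged.
    buffer = io.StringIO()
    stats = pstats.Stats(profiler, stream=buffer)
    stats.sort_stats("cumulative").print_stats(limit)
    return result, buffer.getvalue()
```

Calling it around a single entry point (for example the function that tears a tunnel down) keeps interpreter start-up and import time out of the numbers.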
Without looking at the rule set it is hard to make a guess... But perhaps you don't need to call dump_all()? After all, that's the most expensive call you can make. Perhaps you could benefit from some iptables rule engineering for such a vast infrastructure.
Say you have a chain where you add the "hooks" for each tunnel, and you use deterministic chain names based on the VTI interface. Then you only need to dump a specific table and chain to get all the rules you are interested in.
    # The comment match (-m comment --comment "...") is optional on each rule.
    iptables -A vti_hooks -i vti0 -m comment --comment "vpn_vti0" -j vpn_vti0
    iptables -A vti_hooks -o vti0 -m comment --comment "vpn_vti0" -j vpn_vti0
    iptables -A vpn_vti0 -i vti0 -m comment --comment "vpn_vti0_in" -j vpn_vti0i   # ingress rules in this chain
    iptables -A vpn_vti0 -o vti0 -m comment --comment "vpn_vti0_out" -j vpn_vti0o  # egress rules in this chain
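With chains named like this, the lookup on the Python side could shrink to dumping two chains instead of every table. The following is only a sketch under assumptions: `tunnel_chain_rules` is a hypothetical helper, the chain names follow the `vpn_<ifname>i` / `vpn_<ifname>o` pattern from the commands above, and both chains are assumed to exist already. `iptc.easy.dump_chain` is the same function that appears in the profile; it is injected as a parameter so the helper can be exercised without root.

```python
def tunnel_chain_rules(ifname, table="filter", dump_chain=None):
    """Return {chain_name: [rule dicts]} for one tunnel's ingress/egress chains."""
    if dump_chain is None:
        # Imported lazily: python-iptables needs root and libiptc at runtime.
        import iptc.easy
        dump_chain = iptc.easy.dump_chain

    # Assumed naming scheme: vpn_<ifname>i for ingress, vpn_<ifname>o for egress.
    chains = (f"vpn_{ifname}i", f"vpn_{ifname}o")
    return {chain: dump_chain(table, chain) for chain in chains}
```

The cost then depends on the number of rules in those two chains rather than on the size of the whole rule set.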
Hi @malteohlers, were you able to solve it in the end?
@jllorente, thanks a lot for following up on this! That is much appreciated. In fact no, we didn't solve the problem. Maybe I didn't understand your suggestion correctly, or I simply lack the knowledge to implement it in an efficient way. When I tried to work out whether it would be feasible, it seemed to me that the complexity would grow too much and we would end up with too many hook rules, as we are dealing with a few different tables/chains per endpoint. Here are the Python wrappers we use to create the rules per endpoint. Sometimes a tunnel has many endpoints that each have separate rules, because we NAT each individual IP:
    MARK_IN_MASK = "0xffffffff"


    def mark_traffic_to_endpoint(
        local_endpoint_ip, server_network_interface, comment, mark_in
    ):
        return {
            "table": "mangle",
            "chain": "PREROUTING",
            "rule": {
                "dst": local_endpoint_ip + "/32",
                "in-interface": server_network_interface,
                "comment": {"comment": comment},
                "target": {"MARK": {"set-xmark": f"{mark_in}/{MARK_IN_MASK}"}},
            },
        }


    def mark_traffic_to_endpoint_from_localhost(local_endpoint_ip, comment, mark_in):
        return {
            "table": "mangle",
            "chain": "OUTPUT",
            "rule": {
                "dst": local_endpoint_ip + "/32",
                "comment": {"comment": comment},
                "target": {"MARK": {"set-xmark": f"{mark_in}/{MARK_IN_MASK}"}},
            },
        }


    def dnat_traffic_to_endpoint(local_endpoint_ip, comment, remote_endpoint_ip):
        return {
            "table": "nat",
            "chain": "PREROUTING",
            "rule": {
                "dst": local_endpoint_ip + "/32",
                "comment": {"comment": comment},
                "target": {"DNAT": {"to-destination": remote_endpoint_ip}},
            },
        }


    def dnat_traffic_to_endpoint_from_localhost(
        local_endpoint_ip, comment, remote_endpoint_ip
    ):
        return {
            "table": "nat",
            "chain": "OUTPUT",
            "rule": {
                "dst": local_endpoint_ip + "/32",
                "comment": {"comment": comment},
                "target": {"DNAT": {"to-destination": remote_endpoint_ip}},
            },
        }


    def source_nat_traffic_to_endpoint(
        remote_endpoint_ip, tunnel_ifname, comment, source_nat_ip
    ):
        return {
            "table": "nat",
            "chain": "POSTROUTING",
            "rule": {
                "dst": remote_endpoint_ip + "/32",
                "out-interface": tunnel_ifname,
                "comment": {"comment": comment},
                "target": {"SNAT": {"to-source": source_nat_ip}},
            },
        }
and here are the interface rules:
    def interface_accept_rule(in_interface, comment):
        return {
            "table": "filter",
            "chain": "INPUT",
            "rule": {
                "in-interface": in_interface,
                "state": {"state": "RELATED,ESTABLISHED"},
                "comment": {"comment": comment},
                "target": "ACCEPT",
            },
        }
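One possible way to combine these wrappers with the per-tunnel chain idea from earlier in the thread, sketched purely as a dict transformation: each spec is redirected into a chain named after the tunnel and the built-in chain, and a single hook rule per (table, built-in chain) jumps to it, regardless of how many endpoints the tunnel has. Everything here is an assumption for illustration: `move_to_tunnel_chain` and the `<ifname>_<CHAIN>` naming are hypothetical, the 28-character limit reflects my understanding of the iptables chain-name limit, and whether the library accepts a user-defined chain name as a plain string target should be verified before relying on it.

```python
MAX_CHAIN_NAME = 28  # assumption: iptables rejects chain names of 29+ characters


def move_to_tunnel_chain(spec, tunnel_ifname):
    """Return (hook_spec, moved_spec) for one rule spec.

    moved_spec is the original rule placed in a per-tunnel chain;
    hook_spec is the jump from the built-in chain into that chain.
    """
    chain = f"{tunnel_ifname}_{spec['chain']}"  # hypothetical naming scheme
    if len(chain) > MAX_CHAIN_NAME:
        raise ValueError(f"chain name too long: {chain!r}")

    moved = dict(spec, chain=chain)  # same table and rule, different chain
    hook = {
        "table": spec["table"],
        "chain": spec["chain"],
        "rule": {
            "comment": {"comment": tunnel_ifname},
            "target": chain,  # assumption: a chain name is accepted as target
        },
    }
    return hook, moved
```

Hooks repeat for every rule that shares a table and built-in chain, so they would need de-duplicating before installation. Tearing a tunnel down would then mean removing its hooks and flushing and deleting its chains, with no comment search.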
On one of the production servers we currently have 2560 active iptables rules (200+ tunnels).
We have temporarily managed the problem by scaling up the server and caching the result of iptc.easy.dump_all(), so it is called only once and not multiple times as was the case before.
But as we keep growing, we would like to solve this problem in a better way, either through smarter use of iptables rules/chains or by improving the Python libraries. I'm not 100% comfortable with sharing all of our internal details here, but I would be very happy if we could discuss this in some way :)
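For reference, the caching workaround mentioned above could look roughly like the sketch below. It is not our actual code: `RuleCache` is a hypothetical name, and the dump function is passed in (in production it would be `iptc.easy.dump_all`) so that the class can be tested without touching iptables. The comment matching is the same as in `get_installed_rules_from_comment`.

```python
import re


class RuleCache:
    """Dump the rule set once and answer several comment lookups from memory."""

    def __init__(self, dump_all):
        self._dump_all = dump_all  # e.g. iptc.easy.dump_all
        self._rules = None

    def invalidate(self):
        """Forget the cached dump; call after adding or deleting rules."""
        self._rules = None

    def _flat_rules(self):
        if self._rules is None:
            dump = self._dump_all()
            # Flatten {table: {chain: [rule, ...]}} into (table, chain, rule).
            self._rules = [
                (table, chain, rule)
                for table, chains in dump.items()
                for chain, rules in chains.items()
                for rule in rules
            ]
        return self._rules

    def by_comment(self, comment):
        pattern = re.compile(re.escape(comment) + r"( |$)")
        return [
            (table, chain, rule)
            for table, chain, rule in self._flat_rules()
            if "comment" in rule
            and pattern.match(rule["comment"]["comment"].strip())
        ]
```

This only removes repeated dumps within one script run; every run still pays for one full dump, and the cache goes stale as soon as the rule set is modified.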