python-iptables

Performance issue on ARM architecture when looking up and modifying a rule

Open dolenam317 opened this issue 6 years ago • 3 comments

Hi,

I have a custom chain where packets are marked for further processing. What I am trying to do is:

  • Look up a specific rule based on its specification (protocol, source IP, destination IP, ...). Another interface performs a scan before my script is called, so there are no duplicate rules.
  • If the rule is found, replace it with another rule. The two are identical except for their firewall mark. Then I delete the next rule, which lets packets matching the aforementioned rule escape my custom chain for further processing.
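The two steps above can be sketched in plain Python. This is only a model, not python-iptables code: rules are represented as flat dicts, and the field names (`protocol`, `dst`, `dport`, `mark`, `target`), the mark values, and the helper names `find_rule` and `enable_rule` are all assumptions made for illustration.

```python
def find_rule(rules, spec):
    """Return the index of the first rule whose fields all match spec, or -1."""
    for index, rule in enumerate(rules):
        if all(rule.get(key) == value for key, value in spec.items()):
            return index
    return -1


def enable_rule(rules, spec, new_mark):
    """Swap the matching rule's mark and delete the rule that follows it."""
    index = find_rule(rules, spec)
    if index < 0:
        return False
    # Same rule, different firewall mark.
    rules[index] = dict(rules[index], mark=new_mark)
    # Drop the follow-up rule that lets matching packets leave the chain.
    if index + 1 < len(rules):
        del rules[index + 1]
    return True


# Hypothetical chain contents; values are illustrative only.
chain = [
    {"protocol": "udp", "dst": "52.57.100.58/32", "dport": "27641", "mark": "0x1"},
    {"protocol": "udp", "dst": "52.57.100.58/32", "target": "RETURN"},
    {"protocol": "tcp", "dst": "10.0.0.1/32", "dport": "22", "mark": "0x1"},
]
spec = {"protocol": "udp", "dst": "52.57.100.58/32", "dport": "27641"}
found = enable_rule(chain, spec, "0x2")
```

The lookup runs once and the replace and delete both reuse its index, so the chain is scanned a single time per call.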

After profiling, I saw that the function refresh() is called a lot. It takes about 30% of the execution time, and on our ARM box the script takes about 4 seconds to run.

I have attached the profiling output here: enable_a_firewall_rule.txt. Could you please give me some advice on this matter?
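One way to narrow down where the refresh calls come from is the callers report in the standard library's pstats module. The sketch below is self-contained and uses hypothetical stand-in functions (`refresh`, `lookup_rule`, `replace_rule`, `workload`) rather than the real script, so it only shows the technique, not actual numbers from this issue.

```python
import cProfile
import io
import pstats


def refresh():
    # Hypothetical stand-in for an expensive table refresh.
    return sum(range(1000))


def lookup_rule():
    # Hypothetical caller: one refresh per lookup.
    refresh()


def replace_rule():
    # Hypothetical caller: two refreshes per replacement.
    refresh()
    refresh()


def workload():
    for _ in range(5):
        lookup_rule()
    replace_rule()


def callers_report(func, pattern):
    """Profile func and return the pstats callers report for functions matching pattern."""
    profiler = cProfile.Profile()
    profiler.runcall(func)
    out = io.StringIO()
    stats = pstats.Stats(profiler, stream=out)
    # print_callers lists, for each matching function, who called it and how often.
    stats.sort_stats("cumulative").print_callers(pattern)
    return out.getvalue()


report = callers_report(workload, r"\(refresh\)")
print(report)
```

Run against the real script, the same report would show which call sites account for most of the refresh invocations.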

dolenam317 avatar Dec 02 '19 12:12 dolenam317

Can you share your code? refresh() is called from a few places, it's hard to pinpoint the problem without seeing more.

ldx avatar Dec 02 '19 17:12 ldx

Hi, please find the setup at: https://drive.google.com/file/d/1EXlkHJ-IpCVtTLhyyJtiK6KJLT1yINAV/view?usp=sharing

Steps to reproduce:

  • Extract the tar file to some directory
  • ./fw-setup.py -a restore
  • ./fw-setup.py -a enable_fw_rule -p udp --dest_ip 52.57.100.58/32 --dest_port 27641 --fw_policy accept --firewall --table raw --chain sta_firewall
  • ./fw-setup.py -a disable_fw_rule -p udp --dest_ip 52.57.100.58/32 --dest_port 27641 --fw_policy accept --firewall --table raw --chain sta_firewall

dolenam317 avatar Dec 05 '19 04:12 dolenam317

Hi @dolenam317, are you still actively working on this?

I took a look at your code and I'm glad to see that you are using the easy module :) While I believe the excessive refresh() calls you saw could be triggered by the easy module, your code does not install enough rules to explain the 3.5 sec execution time.

I ran your code on a VM with Debian 10 (buster) following your instructions, but unfortunately I couldn't reproduce the issue. My execution time is 0.1 sec; here is an excerpt:

# python3 -m cProfile -s time fw-setup.py --firewall -a enable_fw_rule --protocol tcp  --dest_ip 34.249.103.184 --dest_port 27641 --table raw --chain sta_firewall --fw_policy accept
         75482 function calls (74549 primitive calls) in 0.101 seconds

   Ordered by: internal time

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
       15    0.014    0.001    0.014    0.001 {method 'read' of '_io.BufferedReader' objects}
       44    0.006    0.000    0.006    0.000 {built-in method posix.read}
       13    0.004    0.000    0.004    0.000 {method 'poll' of 'select.poll' objects}
  254/236    0.004    0.000    0.007    0.000 {built-in method builtins.__build_class__}
       44    0.004    0.000    0.004    0.000 {built-in method marshal.loads}
      264    0.004    0.000    0.004    0.000 {built-in method builtins.dir}
       20    0.003    0.000    0.003    0.000 {built-in method _posixsubprocess.fork_exec}
       21    0.003    0.000    0.003    0.000 {method 'search' of 're.Pattern' objects}
        5    0.003    0.001    0.003    0.001 {built-in method _imp.create_dynamic}
       20    0.002    0.000    0.002    0.000 {built-in method posix.waitpid}
    57/31    0.002    0.000    0.004    0.000 sre_parse.py:475(_parse)
       20    0.001    0.000    0.012    0.001 subprocess.py:1383(_execute_child)
1825/1632    0.001    0.000    0.004    0.000 ip4tc.py:458(__setattr__)
      981    0.001    0.000    0.001    0.000 ip4tc.py:1638(is_chain)
      134    0.001    0.000    0.003    0.000 <frozen importlib._bootstrap_external>:1356(find_spec)
      716    0.001    0.000    0.001    0.000 __init__.py:489(cast)
       70    0.001    0.000    0.008    0.000 ip4tc.py:672(__init__)
       70    0.001    0.000    0.014    0.000 ip4tc.py:1341(_set_rule)
     5671    0.001    0.000    0.001    0.000 {built-in method builtins.isinstance}
      300    0.001    0.000    0.001    0.000 {built-in method posix.stat}
   130/30    0.001    0.000    0.002    0.000 sre_compile.py:71(_compile)

On another note, I noticed that when adding hundreds or thousands of rules the system can become quite laggy. To address this, the iptc.easy module features a series of batch functions that help reduce execution time and stress on netlink sockets. You can check them out here (unfortunately I didn't write any documentation for them): https://github.com/ldx/python-iptables/blob/master/iptc/easy.py#L216
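The idea behind batching can be shown with a small model. This is a sketch under stated assumptions, not the iptc.easy implementation: `FakeTable` and both helper functions are hypothetical, and the model simply assumes that every committed change costs one refresh. In python-iptables itself, the README describes a similar pattern using the `autocommit` attribute and `commit()` method on a table.

```python
class FakeTable:
    """Hypothetical stand-in for a firewall table; not the python-iptables API."""

    def __init__(self):
        self.autocommit = True
        self.rules = []
        self.refresh_count = 0

    def refresh(self):
        # A real table would re-read the whole ruleset here.
        self.refresh_count += 1

    def commit(self):
        # Push pending changes, then re-read state.
        self.refresh()

    def append_rule(self, rule):
        self.rules.append(rule)
        if self.autocommit:
            self.commit()


def add_rules_one_by_one(table, rules):
    # Each append commits on its own, so the cost grows with the rule count.
    for rule in rules:
        table.append_rule(rule)


def add_rules_batched(table, rules):
    # Defer the commit until all rules are queued.
    table.autocommit = False
    try:
        for rule in rules:
            table.append_rule(rule)
        table.commit()
    finally:
        table.autocommit = True


rules = [{"dport": str(port)} for port in range(100)]

slow = FakeTable()
add_rules_one_by_one(slow, rules)

fast = FakeTable()
add_rules_batched(fast, rules)

print(slow.refresh_count, fast.refresh_count)
```

In this model the one-by-one path refreshes once per rule while the batched path refreshes once in total, which is the kind of saving the batch functions aim for.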

Let me know how I can help :)

jllorente avatar Jan 30 '21 20:01 jllorente