core
core copied to clipboard
configctl service reload all breaks ipfw/traffic shaping
I'm not sure if this is the right fix but I present it here for discussion at least anyway.
At the same time as I noticed the other problem with configctl service reload all (#8194) I noticed that it broke my traffic shaping rules. Basically it ends up stopping/starting ipfw in such a way that pf and ipfw are applied in the wrong order.
To reproduce:
$ sudo pfilctl heads
...
In pf:default-in6
In ipfw:default6
Out ipfw:default6
Out pf:default-out6
...
$ sudo configctl service reload all
OK
$ sudo pfilctl heads
...
In ipfw:default6
In pf:default-in6
Out pf:default-out6
Out ipfw:default6
...
Note that now ipfw is applied before pf on inbound. The result of this is that inbound traffic is traffic shaped by ipfw before being de-NATed by pf and therefore a rule which allows bandwidth based on LAN IP doesn't work. The opposite similarly applies for outbound. I've only shown the IPv6 rules above for brevity but the IPv4 rules are also affected.
Note that sudo /etc/rc.d/ipfw stop; sudo /etc/rc.d/ipfw start will also trigger the problem, it's not unique to configctl.
My proposed fix adds a script called after ipfw is stopped, to remove the first load marker file, so that the next ipfw start is treated as a first load, and pf is toggled on/off to set the order. However we also need to enable ipfw explicitly at that point, otherwise it is still disabled, /etc/rc.d/ipfw only sets the enable flag after executing our rules script.
As I say... maybe there is a better way to do this but I had a bit of a think and couldn't find anything obvious. There are no user defined hooks called by the rc script after it sets the sysctls for example, so I think forcing the enable flag might be needed here.
Hi @bensmithurst
Thanks for looking into this issue. Since the whole ipfw implementation has shifted a bit, a similar issue has popped up as a result of making ipfw optional in https://github.com/opnsense/core/commit/3bf818348ca7e306ab4fd5e80a9eca68f98c2167.
This PR predates the shift, but is still very relevant, except the implementation details also changed a bit.
I've opened https://github.com/opnsense/core/pull/8728 which should fix your issue as well if you want to try it out.
@swhite2 thanks for the update, it seems as if the other changes help but still leaves the service reload all case not working nicely. I've updated this PR to simplify it, taking advantage of the updates elsewhere.
As before I'm not sure if this is the best fix, but I can't see any other obvious ways to get the sync_fw_hooks called at the right time in the reload all case. One other option might be a special case in rc.freebsd so that when it is called by service reload all it does something 'special' with ipfw.
Maybe we need ipfw_skip=“YES” to skip the reload side effect. I’m not near any code yet so maybe that’s already there.
@bensmithurst You're right, the whole thing is a bit weird to me (i.e. "enabling" ipfw in poststart: https://github.com/opnsense/src/blob/stable/25.1/libexec/rc/rc.d/ipfw#L112-L116). While your fix is correct, this may produce unintended side effects in the long term.
Perhaps the bigger question here is if we can't guarantee proper functioning by skipping configd, is rc.reload_all doing the right thing? Maybe we need to combine <service>_skip="YES" with an explicit configd reload for that service there as well?
If we use skip we need to reload manually, yes. But I thought the filter reload already does all of it? Or just missing the ipfw piece now with dnctl already in place?
It looks like the filter reload code does dnctl but not ipfw. I've pushed another change which might not be right, but might be closer to what you have in mind?
@bensmithurst Sorry, this took a little longer than expected, but I think the change is correct. Would appreciate your feedback as well @fichtner.
For reference, this PR has been followed up by https://github.com/opnsense/core/commit/358bc256933998772c20956dd528b0fe0e7c1d97 to prevent a configd call in the filter reload. This also unearthed an inconsistency resulting in dnctl failing on bootup which has been addressed there as well.