core icon indicating copy to clipboard operation
core copied to clipboard

configctl service reload all breaks ipfw/traffic shaping

Open bensmithurst opened this issue 10 months ago • 6 comments

I'm not sure if this is the right fix but I present it here for discussion at least anyway.

At the same time as I noticed the other problem with configctl service reload all (#8194) I noticed that it broke my traffic shaping rules. Basically it ends up stopping/starting ipfw in such a way that pf and ipfw are applied in the wrong order.

To reproduce:

$ sudo pfilctl heads
...
                           In               pf:default-in6
                           In             ipfw:default6
                          Out             ipfw:default6
                          Out               pf:default-out6
...
$ sudo configctl service reload all
OK
$ sudo pfilctl heads
...
                           In             ipfw:default6
                           In               pf:default-in6
                          Out               pf:default-out6
                          Out             ipfw:default6
...

Note that now ipfw is applied before pf on inbound. The result of this is that inbound traffic is traffic shaped by ipfw before being de-NATed by pf and therefore a rule which allows bandwidth based on LAN IP doesn't work. The opposite similarly applies for outbound. I've only shown the IPv6 rules above for brevity but the IPv4 rules are also affected.

Note that sudo /etc/rc.d/ipfw stop; sudo /etc/rc.d/ipfw start will also trigger the problem, it's not unique to configctl.

My proposed fix adds a script called after ipfw is stopped, to remove the first load marker file, so that the next ipfw start is treated as a first load, and pf is toggled on/off to set the order. However we also need to enable ipfw explicitly at that point, otherwise it is still disabled, /etc/rc.d/ipfw only sets the enable flag after executing our rules script.

As I say... maybe there is a better way to do this but I had a bit of a think and couldn't find anything obvious. There are no user defined hooks called by the rc script after it sets the sysctls for example, so I think forcing the enable flag might be needed here.

bensmithurst avatar Jan 08 '25 18:01 bensmithurst

Hi @bensmithurst

Thanks for looking into this issue. Since the whole ipfw implementation has shifted a bit, a similar issue has popped up as a result of making ipfw optional in https://github.com/opnsense/core/commit/3bf818348ca7e306ab4fd5e80a9eca68f98c2167.

This PR predates the shift, but is still very relevant, except the implementation details also changed a bit.

I've opened https://github.com/opnsense/core/pull/8728 which should fix your issue as well if you want to try it out.

swhite2 avatar May 27 '25 12:05 swhite2

@swhite2 thanks for the update, it seems as if the other changes help but still leaves the service reload all case not working nicely. I've updated this PR to simplify it, taking advantage of the updates elsewhere.

As before I'm not sure if this is the best fix, but I can't see any other obvious ways to get the sync_fw_hooks called at the right time in the reload all case. One other option might be a special case in rc.freebsd so that when it is called by service reload all it does something 'special' with ipfw.

bensmithurst avatar May 31 '25 17:05 bensmithurst

Maybe we need ipfw_skip=“YES” to skip the reload side effect. I’m not near any code yet so maybe that’s already there.

fichtner avatar May 31 '25 18:05 fichtner

@bensmithurst You're right, the whole thing is a bit weird to me (i.e. "enabling" ipfw in poststart: https://github.com/opnsense/src/blob/stable/25.1/libexec/rc/rc.d/ipfw#L112-L116). While your fix is correct, this may produce unintended side effects in the long term.

Perhaps the bigger question here is if we can't guarantee proper functioning by skipping configd, is rc.reload_all doing the right thing? Maybe we need to combine <service>_skip="YES" with an explicit configd reload for that service there as well?

swhite2 avatar Jun 02 '25 14:06 swhite2

If we use skip we need to reload manually, yes. But I thought the filter reload already does all of it? Or just missing the ipfw piece now with dnctl already in place?

fichtner avatar Jun 02 '25 14:06 fichtner

It looks like the filter reload code does dnctl but not ipfw. I've pushed another change which might not be right, but might be closer to what you have in mind?

bensmithurst avatar Jun 02 '25 19:06 bensmithurst

@bensmithurst Sorry, this took a little longer than expected, but I think the change is correct. Would appreciate your feedback as well @fichtner.

swhite2 avatar Jun 26 '25 07:06 swhite2

For reference, this PR has been followed up by https://github.com/opnsense/core/commit/358bc256933998772c20956dd528b0fe0e7c1d97 to prevent a configd call in the filter reload. This also unearthed an inconsistency resulting in dnctl failing on bootup which has been addressed there as well.

swhite2 avatar Jun 26 '25 14:06 swhite2