tunneldigger
session.pre-down is run after the interface is removed
We have the following line in our pre-down hook:
/sbin/brctl delif saarVPN $INTERFACE
With the old version of the broker (before the rewrite, see e.g. https://github.com/ffrl/tunneldigger/) on old kernels (3.16.0), that worked just fine. However, on kernel 4.9.30 and with the latest broker, we now have errors in the log:
(session.pre-down/26205) interface l2tp2221 does not exist!
I am not seeing these errors with the new broker on an old kernel. So it seems the kernel update is the reason here, not the broker update.
That is strange, isn't it? How would the kernel even know already that the tunnel is dead? Does L2TP have in-band signaling for tearing down the tunnel?
Or maybe this is a race condition? If I read that code in hooks.py correctly, it is asynchronous. So the broker actually goes on and deletes the interface in Tunnel.close while the hook is still running.
> Or maybe this is a race condition? If I read that code in hooks.py correctly, it is asynchronous. So the broker actually goes on and deletes the interface in Tunnel.close while the hook is still running.
This is true, and in this case we would need to wait for the hook to finish executing before proceeding with tunnel teardown.
Right. I think, actually, that for our purpose a post-down hook is good enough, but I would have to check. I mean, deleting the interface will remove it from the bridge, so we don't have to do that in the hook.
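If a hook does keep running at that point, it could at least tolerate the interface already being gone. A minimal sketch in Python, assuming the hook may be written as a Python script; `interface_exists` and `detach_from_bridge` are hypothetical helper names, and the `sysfs_root` parameter is there only so the check can be exercised without real interfaces:

```python
import os
import subprocess


def interface_exists(name, sysfs_root="/sys/class/net"):
    """Return True if the kernel currently exposes a network interface `name`."""
    return os.path.exists(os.path.join(sysfs_root, name))


def detach_from_bridge(bridge, iface, sysfs_root="/sys/class/net"):
    """Remove `iface` from `bridge` unless the interface has already vanished.

    Returns True if brctl was invoked, False if the step was skipped.
    """
    if not interface_exists(iface, sysfs_root):
        # The broker (or the kernel) already deleted the interface, which
        # also removed it from the bridge, so there is nothing left to do.
        return False
    # check=False: the interface may still disappear between the check and
    # this call, so a non-zero exit status is deliberately not an error.
    subprocess.run(["/sbin/brctl", "delif", bridge, iface], check=False)
    return True
```

This only silences the log message. It does not close the race itself, because the interface can still disappear between the check and the `brctl` call.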
Yes, for us such behavior was ok as well and this is the reason why I didn't implement "blocking" hooks. Also, a hook could in theory loop forever and block the tunnel from being freed.
Well, but then it shouldn't be called a pre-down hook. Removing that hook would be the more honest thing to do.
> Also, a hook could in theory loop forever and block the tunnel from being freed.
Well, sure. An admin can always misconfigure their server. The old broker that we still use on our other two servers uses blocking hooks, and that's working just fine. It's not like hooks usually do stuff on the network.
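For illustration, blocking hooks and the "loops forever" concern are not mutually exclusive: the caller can wait for the hook, but only up to a deadline. The following is a rough sketch, not how hooks.py is written; `run_hook_blocking` and the 10-second default are assumptions made up for this example.

```python
import subprocess


def run_hook_blocking(argv, env=None, timeout=10.0):
    """Run a hook and wait for it to exit before returning.

    Returns the hook's exit status, or None if it exceeded `timeout`
    seconds (subprocess.run kills the child in that case).
    """
    try:
        completed = subprocess.run(argv, env=env, timeout=timeout)
    except subprocess.TimeoutExpired:
        # The hook hung. Give up so the tunnel can still be freed.
        return None
    return completed.returncode
```

Teardown would then delete the interface only after this call returns. The trade-off is that, if the broker handles all tunnels in a single event loop, waiting like this stalls every other tunnel for up to the timeout.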
Actually, we have hooks that rely on this blocking behavior. They modify iptables and could mess things up when running concurrently. (Yes, it's a bad hack, but it's needed to work around a bad firewall.) So this [undocumented] change from synchronous to asynchronous hooks could cause trouble for us.
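As long as hooks run asynchronously, one possible workaround is for the hooks to serialize themselves with an advisory file lock. A minimal sketch, assuming the hooks are (or are wrapped by) Python scripts on Linux; `hook_lock` and the lock file path are hypothetical names chosen for this example:

```python
import contextlib
import fcntl
import os


@contextlib.contextmanager
def hook_lock(path="/run/lock/tunneldigger-hooks.lock"):
    """Hold an exclusive advisory lock for the duration of the with-block.

    A second hook process entering the same block waits in flock() until
    the first one leaves it.
    """
    fd = os.open(path, os.O_CREAT | os.O_RDWR, 0o600)
    try:
        fcntl.flock(fd, fcntl.LOCK_EX)  # blocks while another hook holds it
        yield
    finally:
        os.close(fd)  # closing the descriptor releases the lock
```

A hook would put its iptables changes inside `with hook_lock(): ...`. This keeps concurrent hooks from interleaving, but it does not restore any ordering between a hook and the broker's own setup or teardown of the interface.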
And even for the non-hacky part... if the tunnel is only added to the bridge asynchronously, doesn't that mean that the client could already be sending data into the tunnel before it is even plugged into the bridge?