ser2net
ser2net 4.3.8 fails to open TCP ports at boot
I've defined a port with accepter: tcp,1234. When starting ser2net at boot, it complains:
Invalid accepter port name/number 'tcp,1234': Unable to find a valid name on the name server
Unfortunately, it doesn't exit afterwards, but continues to run, without ever opening the TCP port. Simply restarting ser2net later makes it work correctly.
Not sure why it needs to ask a name server in the first place, since I didn't specify a hostname anywhere and just expect it to bind to all interfaces. But in any case it shouldn't end up in an invalid state and either retry opening the port later or exit immediately.
On Tue, Sep 20, 2022 at 05:52:15AM -0700, Alexander Steffen wrote:
> I've defined a port with accepter: tcp,1234. When starting ser2net at boot, it complains:
> Invalid accepter port name/number 'tcp,1234': Unable to find a valid name on the name server
> Unfortunately, it doesn't exit afterwards, but continues to run, without ever opening the TCP port. Simply restarting ser2net later makes it work correctly.
I'm guessing this is a known problem with the systemd startup of ser2net. ser2net needs to start after networking, or the network lookups with getaddrinfo() (even just a port number) fail. Unfortunately, many OS vendors missed that. I don't know much about systemd, but there are many discussions on this in the issues and in the mailing list.
> Not sure why it needs to ask a name server in the first place, since I didn't specify a hostname anywhere and just expect it to bind to all interfaces. But in any case it shouldn't end up in an invalid state and either retry opening the port later or exit immediately.
That's a difficult design decision. People have complained both ways. ser2net can support multiple devices. If one fails, do you shut it down even though the others work? And what about a config reload? If there's something wrong on a reloaded config, would it exit?
I understand your concern, and it's really not any harder to do it either way, but I think the current design is the best.
-corey
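To illustrate the getaddrinfo() point above: even a listen spec that names only a port number is normally turned into socket addresses through the resolver API. The following is a minimal Python sketch of such a port-only lookup, not ser2net's actual code. The function name resolve_listen_addresses and the choice of flags are assumptions made for illustration, and the sketch does not reproduce the boot-time failure itself.

```python
import socket


def resolve_listen_addresses(port):
    """Resolve wildcard listen addresses for a TCP port given only its number.

    Hypothetical helper for illustration; not part of ser2net.
    """
    try:
        # host=None together with AI_PASSIVE requests the wildcard
        # addresses (0.0.0.0 and/or ::), i.e. "bind to all interfaces".
        infos = socket.getaddrinfo(
            None,
            port,
            type=socket.SOCK_STREAM,
            flags=socket.AI_PASSIVE,
        )
    except socket.gaierror as exc:
        # A resolver failure surfaces here, before any socket is opened.
        raise RuntimeError(f"cannot resolve port {port!r}: {exc}") from exc
    return [(family, sockaddr) for family, _type, _proto, _canon, sockaddr in infos]


if __name__ == "__main__":
    for family, sockaddr in resolve_listen_addresses(1234):
        print(family.name, sockaddr)
```

The point of the sketch is only that the lookup and the bind are separate steps, so a resolver error can be reported even though no hostname was configured.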
> ser2net needs to start after networking
The service file that is used here has an After=network.target, but that does not seem to be sufficient?
> If one fails, do you shut it down even though the others work?
In my case, all of them fail, so at least that could be a condition to exit, since without any open ports it is rather useless to keep running. But yes, in general I expect processes to exit when they detect a configuration error, unless perhaps it is explicitly marked as optional. Which could be another solution: add a flag to make the behavior configurable, either a global --exit-on-failure, or a per-device must-not-fail (or optional) flag.
> If there's something wrong on a reloaded config, would it exit?
I'd say in that case the correct behavior is to keep running with the old config, not with a partially applied new config.
> I think the current design is the best
What is wrong with retrying every five seconds for the next five minutes or so? From a usability perspective that seems to me the best of all the solutions that we have discussed so far.
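The retry proposal above (every five seconds, for about five minutes) can be sketched as follows. This is a Python illustration of the idea, not how ser2net behaves. The names open_listener and open_with_retry, and the injectable opener, sleep and clock parameters, are assumptions introduced to keep the sketch testable.

```python
import socket
import time


def open_listener(port):
    """Bind a TCP listener on all IPv4 interfaces; raises OSError on failure."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        sock.bind(("", port))
        sock.listen()
    except OSError:
        sock.close()
        raise
    return sock


def open_with_retry(port, interval=5.0, timeout=300.0,
                    opener=open_listener, sleep=time.sleep,
                    clock=time.monotonic):
    """Call opener(port) until it succeeds or `timeout` seconds have passed."""
    deadline = clock() + timeout
    while True:
        try:
            return opener(port)
        except OSError:
            # Stop once another wait would run past the deadline; the
            # caller can then exit so that a supervisor restarts the service.
            if clock() + interval > deadline:
                raise
            sleep(interval)
```

With the defaults, a port that cannot be opened at boot is retried for up to five minutes before the last error is propagated to the caller.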
On Tue, Sep 20, 2022 at 12:35:54PM -0700, Alexander Steffen wrote:
> > ser2net needs to start after networking
> The service file that is used here has an After=network.target, but that does not seem to be sufficient?
> > If one fails, do you shut it down even though the others work?
> In my case, all of them fail, so at least that could be a condition to exit, since without any open ports it is rather useless to keep running. But yes, in general I expect processes to exit when they detect a configuration error, unless perhaps it is explicitly marked as optional. Which could be another solution: add a flag to make the behavior configurable, either a global --exit-on-failure, or a per-device must-not-fail (or optional) flag.
Yeah, I was thinking that perhaps an option would be a good idea.
> > If there's something wrong on a reloaded config, would it exit?
> I'd say in that case the correct behavior is to keep running with the old config, not with a partially applied new config.
That's much harder to do than you might imagine. You don't know the configuration is bad until you try to use it, and you can't use it until you shut down the old configuration. And if a port is in use, the new config is delayed until the port is free, so you won't know until then.
For anything beyond syntax errors, this is practically impossible.
> > I think the current design is the best
> What is wrong with retrying every five seconds for the next five minutes or so? From a usability perspective that seems to me the best of all the solutions that we have discussed so far.
Well, I had never imagined a situation like this one when I originally wrote it. I assumed that if it failed, it was going to continue to fail. Beyond this one weird situation you won't really get something where retrying will help. Except for an IP port conflict where the other user of the port is transient.
I think the config option is the right way to go.
Thanks,
-corey
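The options discussed so far (a global --exit-on-failure, a per-device must-not-fail flag, and exiting when every port fails) can be combined into one decision function. The Python sketch below shows one possible policy. The flag names are the proposals from this thread, not options confirmed to exist in ser2net, and PortResult and should_exit are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class PortResult:
    """Outcome of trying to open one configured connection (hypothetical)."""
    name: str
    opened: bool
    must_not_fail: bool = False  # proposed per-device flag, assumed here


def should_exit(results, exit_on_failure=False):
    """Decide whether the daemon should exit after startup."""
    failed = [r for r in results if not r.opened]
    if not failed:
        return False
    if exit_on_failure:
        # Proposed global flag: any failure is fatal.
        return True
    if any(r.must_not_fail for r in failed):
        # A connection marked as required failed to open.
        return True
    # Without flags, exit only when nothing at all could be opened.
    return len(failed) == len(results)
```

Exiting with a non-zero status in these cases lets a service manager restart the process, which is the automatic recovery described in this thread.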
Does the service file at https://github.com/cminyard/ser2net/issues/60#issuecomment-1124070221 work for you?
> You don't know the configuration is bad until you try to use it, and you can't use it until you shut down the old configuration.
True. You'd need to save the old configuration before applying the new one, so that you can switch back to it in case the new one is broken. And if the old configuration also does not work anymore, then you can give up and exit.
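The save-and-switch-back idea can be sketched as below. This is a Python illustration under a strong simplifying assumption: the hypothetical apply callable either succeeds or raises synchronously. As noted earlier in the thread, ser2net may not know whether a new configuration works until a port in use becomes free, so this is not a description of what ser2net can do today.

```python
def reload_with_rollback(apply, old_config, new_config):
    """Apply new_config; on failure, fall back to old_config.

    `apply` is a hypothetical callable that raises if a configuration
    cannot be put into effect. Returns the configuration now in effect,
    or raises SystemExit if neither configuration can be applied.
    """
    try:
        apply(new_config)
        return new_config
    except Exception as new_error:
        try:
            # Switch back to the configuration saved before the reload.
            apply(old_config)
        except Exception as old_error:
            # Neither configuration works any more: give up and exit.
            raise SystemExit(
                f"new config failed ({new_error}); "
                f"old config failed too ({old_error})"
            )
        return old_config
```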
> I think the config option is the right way to go.
Sounds fine to me. It won't prevent the failures, but at least they will get detected and fixed automatically (by restarting the service).
> Does the service file at #60 (comment) work for you?
Works as a workaround, but seems to go against what the systemd documentation says:
> It is strongly recommended not to pull in this target [network-online.target] too liberally: for example network server software should generally not pull this in (since server software generally is happy to accept local connections even before any routable network interface is up)
But it seems, for some reason, unlike other "server software" ser2net cannot open TCP ports, not even on localhost, without all network interfaces fully up and running?
Hi all, I just want to mention that we are also seeing this issue (again) on Raspbian's version of 4.3.3 of ser2net.
Previously the fix described in https://github.com/cminyard/ser2net/issues/60#issuecomment-1124070221 did work. Unfortunately it no longer seems to be working for me - I am still investigating why.
I just wanted to add that I definitely think a per-connection option like must-not-fail, and then exiting on failure, is a great idea. The current design makes it extremely difficult to properly handle failures.
On Tue, Sep 20, 2022 at 05:52:15AM -0700, Alexander Steffen wrote:
> I've defined a port with accepter: tcp,1234. When starting ser2net at boot, it complains:
> Invalid accepter port name/number 'tcp,1234': Unable to find a valid name on the name server
> Unfortunately, it doesn't exit afterwards, but continues to run, without ever opening the TCP port. Simply restarting ser2net later makes it work correctly.
> Not sure why it needs to ask a name server in the first place, since I didn't specify a hostname anywhere and just expect it to bind to all interfaces. But in any case it shouldn't end up in an invalid state and either retry opening the port later or exit immediately.
If you start ser2net after the system is booted and it works ok, this is a known issue, but not really with ser2net. If you start ser2net before networking is available, gethostbyname() will always fail, and that's how ser2net translates names. And even if that failed it would fail attempting to open the socket.
You can search through the issues for various solutions. You need to delay the start of ser2net to after bringing networking up somehow.
-corey
I shouldn't respond to email when I'm half asleep. I saw this and didn't see that this was already responded to and first in a series of messages.
Hi @cminyard thanks for the quick reply. I've just opened a PR #84 for a basic implementation of the must-not-fail option that @webmeister suggested. Initial testing shows it seems to effectively solve this issue.
I totally get that the underlying issue is that networking is not ready - telling systemd to wait for this did work for me in the past, but now I'm finding even this no longer works - as well as lots of other things, as I mention in #84 as my justification for adding the option. So this just seems like the most reliable way - before this I literally had to resort to adding a cronjob to check and see if ser2net is actually listening and if not, restart.
Would love some feedback from you - if you think the behaviour is not right happy to adjust it as needed.
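For reference, the cron-style workaround mentioned above (check whether ser2net is listening and restart it if not) might look roughly like this in Python. It is a sketch of that kind of check, not the script that was actually used. The systemd unit name ser2net and the helper names are assumptions, and a probe connection briefly occupies the port being tested.

```python
import socket
import subprocess


def is_listening(host, port, timeout=2.0):
    """Return True if a TCP connection to (host, port) can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def check_and_restart(host, port,
                      restart_cmd=("systemctl", "restart", "ser2net"),
                      run=subprocess.run):
    """Restart the service if the port is not listening.

    The unit name in restart_cmd is an assumption; adjust it to the
    local setup. Returns True if a restart was triggered.
    """
    if is_listening(host, port):
        return False
    run(list(restart_cmd), check=False)
    return True
```

Run periodically, for example from cron with check_and_restart("127.0.0.1", 1234), this only papers over the startup ordering problem, which is why an exit-on-failure option is the more robust route.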
This is fixed a different way, per discussions in PR #84.