
EINTR and interrupted system calls

Open snej opened this issue 4 years ago • 6 comments

A characteristic of earlier UNIX systems was that if a process caught a signal while the process was blocked in a "slow" system call, the system call was interrupted. The system call returned an error and errno was set to EINTR. ... The problem with interrupted system calls is that we now have to handle the error return explicitly.

—Stevens & Rago, Advanced Programming In The UNIX Environment, 3rd ed., sec. 10.5

This is annoying because you have to wrap some calls in a do...while loop, and confusing because the behavior has changed over time, is different on different platforms, and there are workarounds to make it less annoying that don't (to me) seem to help.

It appears that the calls used by sockpp that are affected are connect, accept, send, recv. The man pages for these all list EINTR among the possible errors. All these need to be wrapped like:

int result;
do {
    result = SYSTEMCALL();
} while (result < 0 && last_error_code() == EINTR);
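As a minimal sketch of that loop in compilable form, assuming a POSIX system: `retry_on_eintr` and `read_retry` are hypothetical names for illustration, not part of sockpp, and plain `errno` stands in for `last_error_code()`.

```cpp
#include <cassert>
#include <cerrno>
#include <cstddef>
#include <sys/types.h>
#include <unistd.h>

// Hypothetical helper, not part of sockpp: re-issue `fn` for as long as it
// fails with errno == EINTR. `fn` must return a signed value that is
// negative on error, as the POSIX I/O calls do.
template <typename Fn>
auto retry_on_eintr(Fn&& fn) -> decltype(fn()) {
    decltype(fn()) result;
    do {
        result = fn();
    } while (result < 0 && errno == EINTR);
    return result;
}

// Example wrapper around read(2); the same shape would apply to
// accept(2), send(2) and recv(2).
ssize_t read_retry(int fd, void* buf, std::size_t len) {
    assert(buf != nullptr || len == 0);
    return retry_on_eintr([&] { return ::read(fd, buf, len); });
}
```

One caveat: connect probably should not be retried this way. POSIX specifies that an interrupted connect keeps going asynchronously, so calling it again can fail with EALREADY; the usual approach is to wait for the socket to become writable and then check SO_ERROR.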

I have no idea whether any of this applies to Windows. I don't think Windows even has the notion of "signals" in the same sense as Unix. (@borrrden ?)

snej avatar Sep 09 '19 16:09 snej

If you did this, then a ^C wouldn't be able to get you out of the application, right? Nor a SIGTERM from an orderly shutdown. Or the Linux timeout command (which is very handy in an embedded Linux system). You would be forced to do a SIGKILL, I'm guessing.

We can investigate, but I can't help but notice the lead to that paragraph... "A characteristic of earlier UNIX systems..."

Both "earlier" and "Unix". I wonder how many people would use this lib on UNIX (as opposed to Linux) and an early Unix system at that. Maybe circa 1984? :-)

fpagliughi avatar Sep 09 '19 17:09 fpagliughi

I am absolutely not an expert on signals, but I'm pretty sure that a signal that terminates the process will still terminate it even if a thread is blocked in a system call. So SIGINT and SIGTERM should still stop a process.

I think the reason not to loop on EINTR is that a signal seems to be the only reliable way to interrupt a blocking I/O call. So there might be clients who want to use signals for that purpose, even though it seems pretty heavy-handed to me, kind of like stomping on the floor to get a record player un-stuck.

So maybe this could be a per-socket flag, i.e. have a stream_socket::set_interruptible(bool) method to configure this behavior, with the default being false? And there'd be an internal method that wraps the do...while loop above after checking that flag.
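A rough sketch of what such a flag could look like, using a plain file descriptor rather than sockpp's real classes: `fd_stream` and everything in it are hypothetical, and only the `set_interruptible(bool)` name comes from the suggestion above.

```cpp
#include <cassert>
#include <cerrno>
#include <cstddef>
#include <sys/types.h>
#include <unistd.h>

// Hypothetical sketch only: `fd_stream` is not a sockpp class.
class fd_stream {
    int fd_;
    bool interruptible_ = false;  // default: retry transparently on EINTR

    // Re-issue `fn` on EINTR unless the caller asked to see interruptions.
    template <typename Fn>
    ssize_t retry_unless_interruptible(Fn&& fn) {
        ssize_t result;
        do {
            result = fn();
        } while (result < 0 && errno == EINTR && !interruptible_);
        return result;
    }

public:
    explicit fd_stream(int fd) : fd_(fd) { assert(fd >= 0); }

    void set_interruptible(bool on) { interruptible_ = on; }
    bool interruptible() const { return interruptible_; }

    ssize_t read(void* buf, std::size_t n) {
        return retry_unless_interruptible([&] { return ::read(fd_, buf, n); });
    }

    ssize_t write(const void* buf, std::size_t n) {
        return retry_unless_interruptible([&] { return ::write(fd_, buf, n); });
    }
};
```

With the flag left at false, callers never see EINTR; with it set to true, an interrupted call returns -1 with errno set to EINTR, as the raw system call would.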

snej avatar Sep 09 '19 17:09 snej

BTW, I just remembered that I've seen this interruption behavior in our project: I once hit a breakpoint on one thread while a different thread was connecting to a [slow] server, and when I resumed from the breakpoint, the connect call immediately failed with EINTR.

snej avatar Sep 09 '19 17:09 snej

Yeah. Maybe if you don't catch SIGINT/SIGTERM they still terminate? Can't remember; would need to run a test.
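For what it's worth, a small test along these lines could be sketched as follows, assuming a POSIX system. EINTR only shows up when the signal is caught by a handler that returns; a signal left at its default terminating disposition still ends the process. All names here (`on_signal`, `install_handler`, `errno_of_interrupted_read`) are hypothetical.

```cpp
#include <cassert>
#include <cerrno>
#include <signal.h>
#include <sys/time.h>
#include <sys/types.h>
#include <unistd.h>

static volatile sig_atomic_t g_signals_seen = 0;

// Handler that only counts deliveries and then returns.
extern "C" void on_signal(int) { g_signals_seen = g_signals_seen + 1; }

// Install on_signal for `signo`. With restart == true the handler is
// registered with SA_RESTART, which asks the kernel to resume many
// interrupted calls instead of failing them with EINTR.
bool install_handler(int signo, bool restart) {
    struct sigaction sa {};
    sa.sa_handler = on_signal;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = restart ? SA_RESTART : 0;
    return sigaction(signo, &sa, nullptr) == 0;
}

// Block in read() on an empty pipe while a one-shot timer delivers SIGALRM
// after `delay_ms`. Returns the errno of the read, or 0 if it succeeded.
// Only call this with a non-restarting SIGALRM handler installed;
// otherwise the read resumes and blocks, or the process is terminated.
int errno_of_interrupted_read(int delay_ms) {
    int fds[2];
    if (pipe(fds) != 0) return errno;

    itimerval timer{};
    timer.it_value.tv_sec = delay_ms / 1000;
    timer.it_value.tv_usec = (delay_ms % 1000) * 1000;
    setitimer(ITIMER_REAL, &timer, nullptr);

    char c;
    ssize_t n = read(fds[0], &c, 1);
    int err = (n < 0) ? errno : 0;

    close(fds[0]);
    close(fds[1]);
    return err;
}
```

Calling `install_handler(SIGALRM, false)` and then `errno_of_interrupted_read(50)` should yield EINTR. With no handler installed at all, the default action for SIGALRM terminates the process, so the read never returns.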

But then the decision would be... should we handle this in the library, or should the library retain the lower-level behavior? If nothing else, if we add a flag, should the low-level behavior be the default?

You know, if we do add this, someone, at some point, will post an issue claiming that they're sending their process a SIGTERM, but it's not returning from a system call!

Definitely a deciding factor for me would be if this helped make application code more portable between different targets. I honestly have no idea how non *nix systems act.

fpagliughi avatar Sep 09 '19 18:09 fpagliughi

You are correct in your analysis that Windows does not use signals in C. I am not an expert in this area so I don't know by what mechanism they handle this.

borrrden avatar Sep 09 '19 21:09 borrrden

So... I was going to give the speech about how this library is intended to be mostly low-level and efficient... nearly as efficient as the C API in aggregate... and that a *nix programmer should know about EINTR and handle it properly...

And then I looked at the implementation of readn() and writen() in the library, and I totally forgot to handle EINTR!

D'Oh!
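For reference, a sketch of how readn() and writen() might handle EINTR, loosely following the classic Stevens versions. These are hypothetical free functions over a raw file descriptor, not the library's actual implementation.

```cpp
#include <cassert>
#include <cerrno>
#include <cstddef>
#include <sys/types.h>
#include <unistd.h>

// Read up to `n` bytes, retrying on EINTR. Returns the number of bytes
// read (less than `n` only at end of file), or -1 on any other error.
ssize_t readn(int fd, void* buf, std::size_t n) {
    assert(buf != nullptr || n == 0);
    auto* p = static_cast<char*>(buf);
    std::size_t nread = 0;
    while (nread < n) {
        ssize_t r = ::read(fd, p + nread, n - nread);
        if (r < 0) {
            if (errno == EINTR) continue;  // interrupted: try again
            return -1;
        }
        if (r == 0) break;                 // end of file
        nread += static_cast<std::size_t>(r);
    }
    return static_cast<ssize_t>(nread);
}

// Write exactly `n` bytes, retrying on EINTR. Returns `n`, or -1 on error.
ssize_t writen(int fd, const void* buf, std::size_t n) {
    assert(buf != nullptr || n == 0);
    const auto* p = static_cast<const char*>(buf);
    std::size_t nwritten = 0;
    while (nwritten < n) {
        ssize_t r = ::write(fd, p + nwritten, n - nwritten);
        if (r <= 0) {
            if (r < 0 && errno == EINTR) continue;  // interrupted: try again
            return -1;
        }
        nwritten += static_cast<std::size_t>(r);
    }
    return static_cast<ssize_t>(nwritten);
}
```

Partial transfers that happened before the interruption are kept, since a short count is returned as success and only a call that transferred nothing fails with EINTR.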

fpagliughi avatar Sep 09 '19 21:09 fpagliughi