ponyc icon indicating copy to clipboard operation
ponyc copied to clipboard

Error handling inside asio_event_* functions, checking for API misuse

Open clearyf opened this issue 7 years ago • 4 comments

As part of investigating ponylang/ponyc#2529 I discovered that the process package was triggering errors inside the asio epoll implementation. I haven’t tested on any platforms other than linux, so I’ve no idea what the other asio implementations are doing. The problem was that the process package was closing the pipes to the subprocess, but only later unsubscribing the asio event notification once all three pipes have been closed on the other side (stdin, stdout, stderr). This leaves a race on between two actors, so normally this is what happens:

A: close pipe, fd 10 A: unsubscribe pipe event, fd 10 B: open pipe, kernel returns next available fd, fd 10 B: subscribe pipe event, fd 10

However, there is a race here, this sequence of events happens if the execution of the two actors interleaves:

A: close pipe, fd 10 B: open pipe, kernel returns next available fd, fd 10 B: subscribe pipe event, fd 10 A: unsubscribe pipe event, fd 10

In this case the actor B waiting for events from the second pipe that will never come. The issue here is that unsubscribing a file descriptor which has already been closed leads to races like this. Without adding wrapper classes for all the different kinds of files/pipes/sockets/etc, the next best step is to stop silently swallowing errors in the asio code. epoll returns an error if an attempt is made to remove a closed fd from an epoll set, this error indicates buggy code. Currently all of the pony_asio_event_* functions return void, so it’s not possible for application code to see whether there were any problems. Two straightforward possibilities:

  1. Change the pony_asio_event_* functions that can fail return an error code, and update the calling code to check the error code. iiuc though, this is a ABI break, and code outside of the ponyc repo that calls these functions will break.

  2. Add a BUG() macro or similar to the pony runtime, so if something goes wrong or an API is being misused a message can be printed to stderr. I did a quick experiment here, and in this case current pony master prints lots of epoll_ctl errors using the test program in ponylang/ponyc#2529. This issue would have been noticed much earlier if this feedback was available. Using an assertion to catch errors from epoll_ctl is possible, but risky as it’s a very abrupt change to make that will only be noticed at runtime.

clearyf avatar Mar 03 '18 19:03 clearyf

Discussed on the sync call.

I would tend to think that option :one: is better here. Since these functions are prefixed with pony_ instead of ponyint_, they are considered public functions. So with this in mind, incorrect use of them should be considered a problem with the caller - not a bug in the runtime. This means that error information should be propagated up to the caller as available, allowing the caller to make a decision about what to do next.

So the next step would be to figure out what this change needs to look like.

As you mention, it would be an ABI breaking change, so I think this should probably go through the RFC process so that affected users can chime in (like Wallaroo Labs, who I believe are using these functions from their code).

@clearyf would you be willing to write up a detailed proposal as an RFC to show what you think this C API should look like, and what error information should look like when propagated (ideally in a cross-platform way)?

jemc avatar Mar 07 '18 20:03 jemc

We use them extensively, yes.

SeanTAllen avatar Mar 07 '18 21:03 SeanTAllen

I can try to write up an RFC for this, I'll need to look at a few RFCs to get a feel for what kind of detail is actually required. Hopefully sometime in the next week or two. Are there any RFCs I should look at to see how the RFC process works?

clearyf avatar Mar 11 '18 19:03 clearyf

The process monitor bug portion of this has been fixed. However, the possibility of misusing the API still exists.

SeanTAllen avatar Dec 16 '21 21:12 SeanTAllen