remote-apis icon indicating copy to clipboard operation
remote-apis copied to clipboard

Let exit_code be better aligned with C/POSIX

Open EdSchouten opened this issue 2 years ago • 0 comments

I am glad that REv2 already uses int32 as its data type for exit_code. Even though most UNIX systems historically only had 8-bit exit codes, modern versions of POSIX provide interfaces for obtaining the full integer value. Using functions like waitid(), you can obtain a siginfo_t whose si_status field contains the full value.

Where we do deviate from C/POSIX is that there are no facilities for indicating that a process did not terminate by calling _Exit(), but through a signal. This leads to confusing situations where people wonder why their build actions terminate with exit code 139, while in reality it terminated through SIGSEGV (signal 11, giving exit code 128+11=139). Even worse are modern versions of Go's os/exec package, which will always return an exit code of -1 in case of signal delivery. This means that users can't even figure out why their actions terminated.

One solution to this would be to change the existing field:

  // The exit code of the command.
  int32 exit_code = 4;

To something like this:

  oneof termination_reason {
    // The command terminated by calling the operating system's equivalent
    // of `_Exit()` with the code provided.
    int32 exit_code = 4;

    // The command terminated by the delivery of a signal.
    int32 signal_number = 13;
 }

Even though virtually all operating systems use 9 for SIGKILL and 15 for SIGTERM, C/POSIX make no such requirement. The only place where that does happen, is in the definition of the kill utility, where as part of the X/Open System Interfaces (XSI) they provide a hardcoded list of numerical command line options.

My recommendation would thus be to use this instead:

  oneof termination_reason {
    // The command terminated by calling the operating system's equivalent
    // of `_Exit()` with the code provided.
    int32 exit_code = 4;

    // The command terminated by the delivery of a signal. This field contains the name of
    // the signal, with the "SIG" prefix removed (e.g., "SEGV" for segmentation violations).
    string signal_name = 13;
 }

Unfortunately it's not safe for us to add a oneof to the protocol retroactively, as it would cause exit_code to be zero when signal_name is set. Maybe we can already add this to REv2 in the form of a separate field, with the remark that exit_code is set to a non-zero value if signal_name is set...

EdSchouten avatar Feb 21 '23 09:02 EdSchouten