htop icon indicating copy to clipboard operation
htop copied to clipboard

Implement platform-independent signal lists

Open vidraj opened this issue 2 years ago • 8 comments

This pull request implements platform-independent lists of signals, sorted by the signal number at compilation time, as discussed in bug #1200. The change is implemented for all platforms, but I only tested it on Linux, so more testing is definitely required.

The potentially breaking changes I'm aware of are:

  1. The existing Solaris code does something wacky with #include <signal.h> and I'm not sure how it currently works (it appears not to include signal.h at all), so my change might break that.
  2. NetBSD defines the real-time signals directly in the platform list, while all other platforms only add them when creating the panel. My code removes the special case.
  3. Only signals which differ between platforms are define-protected. It is possible the code will refuse to compile on some platforms. I could define-protect all signals, but didn't, to make sure the compilation fails when the signal.h header is broken or unavailable. It is possible to define-protect all signals save e.g. SIGINT, which should be defined everywhere.
  4. The code relies on signals being actually #defined as numbers, not e.g. as #define SIGINT (SIGHUP+1). If this assumption is incorrect, the code would issue a compile-time warning and produce an unsorted list.

Stylistic and nitpicky issues:

  1. I've never used Autotools before, so while the new code in configure.ac and Makefile.am works on my machine, it may not be entirely correct or idiomatic.
  2. In particular, the test for sort is probably unnecessary, since the test doesn't stop the build if sort is not present, instead setting the SORT variable to just sort instead of the path. The test for sed works the same, but it at least tries to find a version of sed that actually works. It might be better to write it as
    AC_PATH_PROG([SORT], [sort], [no])
    AS_IF([test "x$SORT" = xno], [AC_MSG_ERROR([sort not found in PATH])])
    
    but I'm not sure what the preferred way is.
  3. The sed scripts are arguably ugly. However, it's a tool that's available everywhere and should just work, unlike e.g. Python or Perl. Perhaps AWK would be a better choice, but I don't speak that language.

vidraj avatar Mar 09 '23 18:03 vidraj

Firstly, thanks for your work that proves my idea would work. However I think you made your C preprocessor and sed script a little more complicated than necessary.

I think the HTOP_SIGENTRY macro and the first sed script can be simplified to something like this (the code is untested):

// It is unlikely that braces would appear in SIG* macro
// definitions; hence we use braces as field delimiters.
#define HTOP_SIGENTRY(x) __htop_sigentry__ { x }{ #x }
# In sed

/^$/ d;
# Normalize whitespaces
s/[[:space:]][[:space:]]*/ /g;
# Discard "#line" directives
/^ *#/ d;
/^ *%:/ d;

:start
/__htop_sigentry__ *{ *[^}]* *}{ *[^}]* *}/ {
   s/__htop_sigentry__ *{ *\([^}]*\) *}{ *\([^}]*\) *}/\n__htop_sigentry__{\1}{\2}\n/1;
   D;
   P;
   D;
   b start;
}

# GNU sed prints the last line when 'N' command reaches
# end of file (non-POSIX behavior). Need this workaround before 'N':
$ d;
N;
s/\n/ /1;
b start;

Explorer09 avatar Mar 10 '23 04:03 Explorer09

Thank you for your swift reaction and the comments!

I think the HTOP_SIGENTRY macro and the first sed script can be simplified to something like this (the code is untested):

// It is unlikely that braces would appear in SIG* macro
// definitions; hence we use braces as field delimiters.
#define HTOP_SIGENTRY(x) __htop_sigentry__ { x }{ #x }

That was my initial implementation as well, however, unlikely is not impossible and once I wrote code to at least detect extraneous braces and report an error, it quickly grew beyond the current implementation.

I'd like the code to be robust, especially since braces can enclose complex expressions with multiple statements inside. I can see that somebody could try to use that to e.g. track uses of deprecated signals, so it is not entirely unlikely for such code to appear. At the very least, the code should not break when that happens. So I still prefer my solution, even though it is longer, I believe the error checking is worth it.

That being said, I can redo the pull request with your suggestions if you want – let me know what your preferences are, you're the maintainer here. :-)

# In sed

/^$/ d;
# Normalize whitespaces
s/[[:space:]][[:space:]]*/ /g;
# Discard "#line" directives
/^ *#/ d;
/^ *%:/ d;

:start
/__htop_sigentry__ *{ *[^}]* *}{ *[^}]* *}/ {
   s/__htop_sigentry__ *{ *\([^}]*\) *}{ *\([^}]*\) *}/\n__htop_sigentry__{\1}{\2}\n/1;
   D;
   P;
   D;
   b start;
}

# GNU sed prints the last line when 'N' command reaches
# end of file (non-POSIX behavior). Need this workaround before 'N':
$ d;
N;
s/\n/ /1;
b start;

As written, this requires quadratic space to reach the first sigentry, though a /__htop_sigentry__/ !d fixes that, and the D command starts a new cycle, it has to be replaced by s/[^\n]*\n//. Quick fixes aside, the output needs further processing – these are the first two entries it prints for me:

__htop_sigentry__{# 13 "signals/SignalList.in.c" 3 4 6  # 13 "signals/SignalList.in.c" }{"SIGABRT" }
__htop_sigentry__{# 15 "signals/SignalList.in.c" 3 4 14  # 15 "signals/SignalList.in.c" }{"SIGALRM" }

A possible fix is to add

s/\n[[:space:]]*#[^\n]*//
s/\n[[:space:]]*%:[^\n]*//
s/\n[[:space:]]*??=[^\n]*//

after the last N.

vidraj avatar Mar 12 '23 12:03 vidraj

That was my initial implementation as well, however, unlikely is not impossible and once I wrote code to at least detect extraneous braces and report an error, it quickly grew beyond the current implementation.

I'd like the code to be robust, especially since braces can enclose complex expressions with multiple statements inside. I can see that somebody could try to use that to e.g. track uses of deprecated signals, so it is not entirely unlikely for such code to appear. At the very least, the code should not break when that happens. So I still prefer my solution, even though it is longer, I believe the error checking is worth it.

How is it possible for a SIG* macro to contain braces without causing a syntax error upon use?

I see the possibility for a signal name macro to be defined like (SIGHUP + 1) or even (__do_something _internally__() [[attribute_foo]], 2), but unless it's an array or struct definition, it makes no sense to use braces at all. Correct me if I'm wrong.

As written, this requires quadratic space to reach the first sigentry, though a /__htop_sigentry__/ !d fixes , and the D command starts a new cycle, it has to be replaced by s/[^\n]*\n//.

Okay. Thanks for catching those.

Quick fixes aside, the output needs further processing – these are the first two entries it prints for me:

__htop_sigentry__{# 13 "signals/SignalList.in.c" 3 4 6  # 13 "signals/SignalList.in.c" }{"SIGABRT" }
__htop_sigentry__{# 15 "signals/SignalList.in.c" 3 4 14  # 15 "signals/SignalList.in.c" }{"SIGALRM" }

A possible fix is to add

s/\n[[:space:]]*#[^\n]*//
s/\n[[:space:]]*%:[^\n]*//
s/\n[[:space:]]*??=[^\n]*//

after the last N.

Yes. When I wrote the draft code I forgot to remove the lines beginning with # after joining them with N. But I would write like this:

s/[[:space:]][[:space:]]*/ /g;
s/\n *#[^\n]*//
s/\n *%:[^\n]*//
s/\n *??=[^\n]*//

Explorer09 avatar Mar 12 '23 13:03 Explorer09

How is it possible for a SIG* macro to contain braces without causing a syntax error upon use?

I see the possibility for a signal name macro to be defined like (SIGHUP + 1) or even (__do_something _internally__() [[attribute_foo]], 2), but unless it's an array or struct definition, it makes no sense to use braces at all. Correct me if I'm wrong.

There is (as usual :-) ) a GNU extension for this: https://gcc.gnu.org/onlinedocs/gcc/Statement-Exprs.html

vidraj avatar Mar 12 '23 13:03 vidraj

How is it possible for a SIG* macro to contain braces without causing a syntax error upon use? I see the possibility for a signal name macro to be defined like (SIGHUP + 1) or even (__do_something _internally__() [[attribute_foo]], 2), but unless it's an array or struct definition, it makes no sense to use braces at all. Correct me if I'm wrong.

There is (as usual :-) ) a GNU extension for this: https://gcc.gnu.org/onlinedocs/gcc/Statement-Exprs.html

Fine, but I have another idea to make it:

#define HTOP_SIGENTRY(x) __htop_sigentry__ $ x $ #x

`, @, $ are the three characters in the ASCII printable set that are unused in the C language (they might appear in string or character literals, but they have no use outside them).

Explorer09 avatar Mar 12 '23 17:03 Explorer09

An alternative might be to test a set of signal IDs for availability, e.g. via sigaddset(3), and then use the ones not returning EINVAL with the name from strsignal(3).

cgzones avatar Jun 01 '23 21:06 cgzones

@cgzones Checking signal availability via such API calls would add overhead in run time; plus configure script can't check those when cross-compiling.

Explorer09 avatar Jun 02 '23 02:06 Explorer09

@cgzones Checking signal availability via such API calls would add overhead in run time;

You only have to do this once at startup and it's not like we are iterating millions of signal names there. I think that cost is negligible.

plus configure script can't check those when cross-compiling.

Which should just trigger a static fallback, we would need anyway in case that API was unavailable for some reason.

Overall, IMO the current version of the patch is a bit over-engineered and a simpler "check for target arch, switch to appropriate list depending on that" would be better.

BenBE avatar Jun 02 '23 08:06 BenBE