seastar icon indicating copy to clipboard operation
seastar copied to clipboard

reactor: default to oneline backtraces in signal handlers

Open fruch opened this issue 1 year ago • 6 comments
trafficstars

align the handlers with the same logic that is used in the reactor stalls backtrace prints

Ref: https://github.com/scylladb/scylladb/issues/16922

fruch avatar Jan 22 '24 21:01 fruch

I gave up trying to pass the reactor_options into the signal handlers

I think of possible two direction

  1. pass way is via global variables
  2. duplicate the handlers - multiline handlers, oneline handlers

fruch avatar Jan 23 '24 13:01 fruch

Some of the CI runs failed because of compiler warnings on unused functions and such. Please look into these failures.

nyh avatar Jan 23 '24 16:01 nyh

align the handlers with the same logic that is used in the reactor stalls backtrace prints

Ref: scylladb/scylladb#16922

I am missing some context about this patch:

  1. What's wrong with multi-line backtraces, especially in the case of a one-off abort. I see some comments in the ScyllaDB issue about some ScyllaDB-specific testing tools that don't like the multi-line format, but it's not a very convincing reason.
  2. Why did we reach the current situation that we have both formats? The commit that introduced the one-line backtrace for reactor stalls, 4006d78ae88630540d1dd6ad4bb084c212901733, by @bhalevy doesn't explain, or why he also wanted to keep the old format. I want to understand that the current mishmash - stalls printed in one format and crashes in another format - was never intentional.
  3. Do our tools, especially Seastar's scripts/addr2line.py and @bhalevy 's backtracing website, support the one-line format? (maybe they already do, but it should be tested, and mentioned).

nyh avatar Jan 23 '24 16:01 nyh

align the handlers with the same logic that is used in the reactor stalls backtrace prints

Ref: scylladb/scylladb#16922

I am missing some context about this patch:

  1. What's wrong with multi-line backtraces, especially in the case of a one-off abort. I see some comments in the ScyllaDB issue about some ScyllaDB-specific testing tools that don't like the multi-line format, but it's not a very convincing reason.

It was agreed 4 years ago, you can see the rest of the discussion here: https://github.com/scylladb/scylladb/issues/5464

There was agreement almost from side to side that one line backtraces are better

And evens idea and suggestions on how to compress it even further

It's not that the tool doesn't like it, it adds complexity which is slowing those tools down, and they need to handle cases of interlaced backtraces as one example (yes it was relevert to stalls more the signal handlers), but again it is something that was agreed almost 4 years ago

  1. Why did we reach the current situation that we have both formats? The commit that introduced the one-line backtrace for reactor stalls, 4006d78ae88630540d1dd6ad4bb084c212901733, by @bhalevy doesn't explain, or why he also wanted to keep the old format. I want to understand that the current mishmash - stalls printed in one format and crashes in another format - was never intentional.

I don't know if it was intentional or not, I think @bhalevy cared more about reactor stalls in the context of performance testing and analysis

  1. Do our tools, especially Seastar's scripts/addr2line.py and @bhalevy 's backtracing website, support the one-line format? (maybe they already do, but it should be tested, and mentioned).

Again @bhalevy probably know the answer to this one, I assume the answer is yes they do.

fruch avatar Jan 23 '24 17:01 fruch

at least now the compilers are happy, but still didn't found a way to get the template working, to avoid the duplication.

fruch avatar Jan 25 '24 11:01 fruch

@scylladb/seastar-maint

can someone give a hand with this one ? the only approach I've been able to conjure here is to duplicate the handlers.

fruch avatar Jun 02 '24 15:06 fruch