dmd
dmd copied to clipboard
Print stack trace for segfaults on linux
https://forum.dlang.org/post/[email protected]
I don't know how to test this, because it requires the test to actually segfault.
Thanks for your pull request and interest in making D better, @dkorpel! We are looking forward to reviewing it, and you should be hearing from a maintainer soon. Please verify that your PR follows this checklist:
- My PR is fully covered with tests (you can see the coverage diff by visiting the details link of the codecov check)
- My PR is as minimal as possible (smaller, focused PRs are easier to review than big ones)
- I have provided a detailed rationale explaining my changes
- New or modified functions have Ddoc comments (with
Params:andReturns:)
Please see CONTRIBUTING.md for more information.
If you have addressed all reviews or aren't sure how to proceed, don't hesitate to ping us with a simple comment.
Bugzilla references
Your PR doesn't reference any Bugzilla issue.
If your PR contains non-trivial changes, please reference a Bugzilla issue or create a manual changelog.
Testing this PR locally
If you don't have a local development environment setup, you can use Digger to test this PR:
dub run digger -- build "master + dmd#15331"
Should be fine to test with a .sh
I don't think it is a good idea to enable this by default. These signal handlers are generally iffy and this one in particular worries me in a few ways:
-
What happens if there's multiple threads and one of them triggers the fault while the others are running? Signal handlers, generally speaking, can run in any random thread that didn't mask it out. Since it uses all gshared stuff my thought is it'd at least work as intended - the assert error is thrown from the fault site and that either terminates the thread (or is caught and recovers) and the other threads keep executing - but you'd want to confirm it.
-
What happens if you run the program in a debugger, or on a distro with automatic crash reporting? This would likely cause a significant usability regression in that case, but this is of course mitigated by always using --DRT-trapExceptions=0 in debuggers anyway. We'd need a way to turn it back off (tho of couse setting the default signal handler back would do it, having a function in the lib to do that for you and documenting it would probably be good)
-
What if the fault occurs inside a C function? I again expect it to work in practice but would like confirmation.
Thanks, these are the critical questions I was looking for.
These signal handlers are generally iffy and this one in particular worries me in a few ways:
You obviously shouldn't get a segfault in the first place, but if one happens, how would this signal handler be worse than the default behavior?
What happens if there's multiple threads and one of them triggers the fault while the others are running?
I'll test this
or is caught and recovers
Hopefully not, segfaults shouldn't be recovered from
What happens if you run the program in a debugger
I tried with gdb, and it still catches the SIGSEGV itself
Program received signal SIGSEGV, Segmentation fault.
_Dmain () at ../test/test_.d:9
9 int a = *p; // segmentation fault: null pointer read/write operation
or on a distro with automatic crash reporting
I don't know how those work, but if they recognize crashes caused by range violations or other D assert trips, it will also work with this signal handler, and if they don't, then they don't work with D programs in general.
having a function in the lib to do that for you and documenting it would probably be good
Yes, I can add a deregisterMemoryErrorHandler2 function as well.
What if the fault occurs inside a C function? I again expect it to work in practice but would like confirmation.
I tried separately compiling an endlessly recursing c function with gcc, it works as expected.
I've not enabled it by default anymore, and added a test.
I've tested the handler in a separate thread, and it works for null pointers, but stack overflow doesn't. I think the alt stack only works for the thread you call the register function on.
Another lament against the memoryerror module is it's not portable across compiler vendors, let alone architectures.
Another lament against the memoryerror module is it's not portable across compiler vendors, let alone architectures.
Can you not even import the module in gdc/ldc? The new handler doesn't use inline Assembly so it should be more portable, as long as you get the right stack pointer register from context.uc_mcontext.gregs
Another lament against the memoryerror module is it's not portable across compiler vendors, let alone architectures.
Can you not even import the module in gdc/ldc? The new handler doesn't use inline Assembly so it should be more portable, as long as you get the right stack pointer register from
context.uc_mcontext.gregs
Checked against your branch, and the answer is still no.
druntime/src/etc/linux/memoryerror.d:233:13: error: undefined identifier ‘naked’
233 | naked;
| ^
druntime/src/etc/linux/memoryerror.d:237:18: error: found ‘RBP’ when expecting ‘:’
237 | push RBP;
| ^
druntime/src/etc/linux/memoryerror.d:238:17: error: found ‘RBP’ when expecting ‘:’
238 | mov RBP, RSP;
| ^
druntime/src/etc/linux/memoryerror.d:238:20: error: expected constant string constraint for operand, not ‘,’
238 | mov RBP, RSP;
| ^
druntime/src/etc/linux/memoryerror.d:240:17: error: found ‘sigsegvDataHandler’ when expecting ‘:’
240 | jmp sigsegvDataHandler;
| ^
Just the tip of the wall of errors, but all asm-related.