dmd icon indicating copy to clipboard operation
dmd copied to clipboard

Implement null dereference check

Open rikkimax opened this issue 2 months ago • 50 comments

Approved work: https://forum.dlang.org/post/[email protected]

Currently turned on to see what the CI does.

Will request Walter to review once I'm happy with CI.

Heavily inspired by OpenD with thanks to @adamdruppe.

EDIT: Special thanks to @limepoutine for helping to get both function pointers and delegate function pointers to be null checked!

rikkimax avatar Oct 28 '25 09:10 rikkimax

Thanks for your pull request and interest in making D better, @rikkimax! We are looking forward to reviewing it, and you should be hearing from a maintainer soon. Please verify that your PR follows this checklist:

  • My PR is fully covered with tests (you can see the coverage diff by visiting the details link of the codecov check)
  • My PR is as minimal as possible (smaller, focused PRs are easier to review than big ones)
  • I have provided a detailed rationale explaining my changes
  • New or modified functions have Ddoc comments (with Params: and Returns:)

Please see CONTRIBUTING.md for more information.


If you have addressed all reviews or aren't sure how to proceed, don't hesitate to ping us with a simple comment.

Bugzilla references

Your PR doesn't reference any Bugzilla issue.

If your PR contains non-trivial changes, please reference a Bugzilla issue or create a manual changelog.

Testing this PR locally

If you don't have a local development environment setup, you can use Digger to test this PR:

dub run digger -- build "master + dmd#22040"

dlang-bot avatar Oct 28 '25 09:10 dlang-bot

Fun fact, lazy parameters can have a null object. Oh well we just can't check for null on the object of a delegate prior to a call.

rikkimax avatar Oct 28 '25 10:10 rikkimax

So a NullPointerError is unrecoverable, right?

nordlow avatar Oct 28 '25 13:10 nordlow

So I take it a NullPointerError unrecoverable, right?

It is an Error. Something has to die in response, like all Error's.

rikkimax avatar Oct 28 '25 13:10 rikkimax

Is the plan to have -check=nullderef=on be on my default or optionally in debug mode or something else?

nordlow avatar Oct 28 '25 14:10 nordlow

There's no technical reason why you can't recover from a NullPointerError in general, though you may choose not to as it can indicate you don't understand program state.

adamdruppe avatar Oct 28 '25 14:10 adamdruppe

Will using -check=nullderef=on affect performance?

Possibly.

I intend to wire up the fast DFA engine to elide the checks. But can't do it until what does exist is proven to work.

rikkimax avatar Oct 28 '25 14:10 rikkimax

The performance impact is very small. The cpu's branch predictor handles it well and ldc's code generator is decent at removing dead checks and arranging the code blocks in a cache-friendly manner. (dmd's isn't so good but ldc is known to optimize better than dmd in many ways)

adamdruppe avatar Oct 28 '25 14:10 adamdruppe

So a single unittest in libdparse at

https://buildkite.com/dlang/dmd/builds/43787/steps/canvas?sid=019a2b1c-9d02-4993-b9df-dfb58a18783d

fails buildkite/dmd.

The libdparse/test/fail_files/killer2.d contains

unittest
{
version (53056)[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[
union {}
Klse
gosh ÿ 'A+ SPL : tape];
		}
	}
}

. Could the CI failure be caused by the compiler existing with error code other than what the tests expects? I also wonder how these seemingly random test cases where deduced...

nordlow avatar Oct 29 '25 09:10 nordlow

I'm aware, but the problem will be in the compiled libdparse, and without compiling it locally I won't be figuring out which emitted hook did it.

As far as I know this could be related to XMM intrinsics, and since those are more of a known problem they're up first.

rikkimax avatar Oct 29 '25 10:10 rikkimax

It's a lot healthier now. The disabling of intrinsic arguments from having the checks added has worked.

However, for the memoryerror_null_read test in druntime, I'm only guessing that adding that rule at the bottom won't trigger the file to be tested in the makefile.

rikkimax avatar Oct 29 '25 11:10 rikkimax

Okay automem was failing due to unit-threaded which was using lazy. So looks like all this pointer's cannot be checked.

rikkimax avatar Oct 29 '25 11:10 rikkimax

Okay automem was failing due to unit-threaded which was using lazy. So looks like all this pointer's cannot be checked.

Do you mean we need to exclude lazy parameters from checking?

nordlow avatar Oct 29 '25 12:10 nordlow

Okay automem was failing due to unit-threaded which was using lazy. So looks like all this pointer's cannot be checked.

Do you mean we need to exclude lazy parameters from checking?

We couldn't do that, since they are delegates under the hood that can be copied.

Right now I'm disabling all this pointer checks for a function call.

I have a solution for this but I'll need to go to a meeting to present it. When a lazy delegate gets constructed and it doesn't need a context pointer, instead of setting it to 0, set it to size_t.max, that'll get around the check quite happily and you can tell in a debugger it's a bad value.

rikkimax avatar Oct 29 '25 12:10 rikkimax

Okay looks like I misread automem backtrace, its meant to be erroring like that it's a genuine null deref :)

Turning back on vtable this pointer check.

rikkimax avatar Oct 29 '25 12:10 rikkimax

Ahhh my old "friend": core.exception.NullPointerError@./src/ocean/util/container/cache/ExpiringLRUCache.d(342): Null pointer dereference

Yup I know exactly what is failing there.

https://github.com/sociomantic-tsunami/ocean/pull/891

Fast DFA was able to catch that prior to me having to disable integral comparisons (couldn't invert the expression for or's).

rikkimax avatar Oct 29 '25 12:10 rikkimax

It looks like there may be a C++ interop problem.

rikkimax avatar Oct 29 '25 12:10 rikkimax

NullPointerError and InvalidPointerError needs to be removed from etc.linux.memoryerror.

Disabling test in there also (should be moved out).

rikkimax avatar Oct 29 '25 12:10 rikkimax

core.exception.NullPointerError@src/dmd/dfa/fast/statement.d(538): Null pointer dereference

Okay that is just hilarious.

This is why it's a good idea to pair static analysis up with these runtime checks. Multiple layers of defence.

rikkimax avatar Oct 29 '25 13:10 rikkimax

Looks like the C++ interop is just fine.

The issue was a null pointer being passed in from C++ into D and was rightly triggering the check.

Disabling the test.

rikkimax avatar Oct 30 '25 15:10 rikkimax

Good. I will certainly not miss needing to fire up a debugger for these. It also makes my life much easier on MMU-less platforms.

Herringway avatar Oct 30 '25 15:10 Herringway

Okay @WalterBright before I turn this off by default and take this PR out of draft, is there anything you need from me for this PR to be merged?

rikkimax avatar Oct 30 '25 16:10 rikkimax

I'm not sure if it's covered by an existing test, but I notice there are no tests for function pointers included here.

Herringway avatar Oct 30 '25 17:10 Herringway

That would be because they are not checked.

rikkimax avatar Oct 30 '25 17:10 rikkimax

I tried to implement it, and the backend error'd out.

rikkimax avatar Oct 30 '25 17:10 rikkimax

wait which problem did you have on function pointers? that one was working for me in opend, it was delegates that were iffy.

adamdruppe avatar Oct 30 '25 17:10 adamdruppe

wait which problem did you have on function pointers? that one was working for me in opend, it was delegates that were iffy.

https://github.com/dlang/dmd/pull/22040/files#diff-54719c55eec4a6cfe3b7b7f4b5289d0b5a0b9d41e8589a023bba4d68921f8be0R5494

Is the code for it. I wrote it up pretty quickly and not really willing to dig in.

rikkimax avatar Oct 30 '25 17:10 rikkimax

I tried to implement it, and the backend error'd out.

Hopefully this can be fixed, but if not, the limitations should be detailed somewhere. "not all pointer dereferences are guaranteed to get a check" is good enough for a changelog entry, but it shouldn't be the only mention.

Herringway avatar Oct 30 '25 17:10 Herringway

Ideally these cases would get fixed. Along with intrinsics getting the checks.

It needs to go to Walter to investigate before we can really make a decision on that front. Either way it needs to land in dmd to get into ldc.

rikkimax avatar Oct 30 '25 17:10 rikkimax

Ouch, I didn't expect Windows needs fixArgumentEvaluationOrder on delegates. Maybe as well revert that.

limepoutine avatar Nov 14 '25 13:11 limepoutine