meson icon indicating copy to clipboard operation
meson copied to clipboard

Sanitizer enhancements

Open dcbaker opened this issue 4 years ago • 19 comments

This little series makes a serious overhaul of the way b_sanitize works.

First, some cleanup around spurious warnings for sanitizer options.

Then, instead of being a combo option, it becomes a free form array, and we do an internal compiler.has_arguments check to see if the compiler actually supports the arguments that we've asked for. Using an array, rather than a string, has some nice properties for writing meson build files. With a string, it's tempting to write things like:

if get_option('b_sanitize') == 'address,undefined'
  ...
endif

But this isn't actually safe anymore, since it well could be 'undefined,address'. It also heads off the possibility of a compiler that doesn't use a , as a separator, say something that takes -Osanitize=address -Osanitize=undefined.

But, in order to preserve backwards compatibility, the get_option('b_sanitize') has been specialized to return a sorted string value if the meson compatibility version is < 0.62 (< 0.61.99 at the moment, because it's impossible to test otherwise) and a list if it is >= 0.62. This means that if a user with 0.62 sets -Db_sanitize=['undefined', 'address'] then get_option() will still return address,undefined. This doesn't fix cases of adding new, previously unsupported sanitizers into the mix, ['undefined', 'address', 'fuzzer'] will still trip things up.

Fixes: #9822 Fixes: #7192 Fixes: #8283 Fixes: #7761 Fixes: #5154 Fixes: #1582

dcbaker avatar Jan 13 '22 19:01 dcbaker

@bonzini, You also took a stab at this in the past, so I thought you might be interested

dcbaker avatar Jan 13 '22 19:01 dcbaker

specialized to return a sorted string value if the meson compatibility version is < 0.62 (< 0.61.99 at the moment, because it's impossible to test otherwise) and a list if it is >= 0.62.

This makes me so incredibly nervous, it's basically cmake policies.

eli-schwartz avatar Jan 13 '22 19:01 eli-schwartz

We're kinda in between a rock an a hard place. Our options seem to be:

  1. accept the b_sanitize is useless and deprecate it, make it harder for developers and projects that need to know how to pass sanitizers to each compiler
  2. pretend everything works
  3. list ever possible option for each compiler/liniker in every possible order
  4. do this
  5. do mostly this, but make it impossible to do anything with the sanitizers in a way that works for both meson >= 0.62 and < 0.62 without explicitly rewriting them (ie, breaking all old checkouts with meson 0.62)
  6. do mostly this, but always return a string value

In my mind 4 is the best choice, and 1 is the next least worst. But we really have to stop intentionally breaking backwards compatibility. Short of this sort of versioned approach or going to a cmake style (gross) policy system, I'm not sure how to solve the issue.

6 is next, but that just means that all users forever will have to split the output to get it in a form useful to work with.

dcbaker avatar Jan 13 '22 20:01 dcbaker

Both string and array do have .contains().

Only the latter has the in operator.

Splitting it because you expect it to be a string and want to convert it to an array would indeed break and cause existing meson.build files to break strangely.

...

You are correct that none of the options really seem to inspire tremendous enthusiasm.

Shower thought: could we somehow warn any time get_option('b_sanitize') is checked with == instead of .contains()? WARNING: b_sanitize will change to an array when minimum meson_version >=0.62. Use .contains() to ensure it works either way. Probably not...

eli-schwartz avatar Jan 13 '22 20:01 eli-schwartz

Codecov Report

Attention: Patch coverage is 63.93443% with 22 lines in your changes missing coverage. Please review.

Project coverage is 55.81%. Comparing base (29c26d5) to head (0e22712). Report is 2223 commits behind head on master.

Files Patch % Lines
mesonbuild/interpreter/interpreter.py 12.50% 5 Missing and 2 partials :warning:
mesonbuild/compilers/rust.py 44.44% 5 Missing :warning:
mesonbuild/compilers/compilers.py 76.47% 2 Missing and 2 partials :warning:
mesonbuild/compilers/mixins/visualstudio.py 50.00% 1 Missing and 1 partial :warning:
mesonbuild/linkers/linkers.py 77.77% 1 Missing and 1 partial :warning:
mesonbuild/backend/vs2010backend.py 0.00% 1 Missing :warning:
mesonbuild/compilers/mixins/gnu.py 80.00% 0 Missing and 1 partial :warning:

:exclamation: There is a different number of reports uploaded between BASE (29c26d5) and HEAD (0e22712). Click for more details.

HEAD has 10 uploads less than BASE
Flag BASE (29c26d5) HEAD (0e22712)
19 9
Additional details and impacted files
@@             Coverage Diff             @@
##           master    #9825       +/-   ##
===========================================
- Coverage   68.71%   55.81%   -12.90%     
===========================================
  Files         406      406               
  Lines       87766    87793       +27     
  Branches    19518    19360      -158     
===========================================
- Hits        60311    49006    -11305     
- Misses      22892    34658    +11766     
+ Partials     4563     4129      -434     

:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.

codecov[bot] avatar Jan 13 '22 23:01 codecov[bot]

That would actually be fairly trivial to do in meson++ with it's IR, lol. I think that would be quite challenging for an interpreter though, there isn't a clean, easy way to declare that you're comparing the output of get_option('b_sanitize'). maybe @mensinda knows how to do it?

dcbaker avatar Jan 13 '22 23:01 dcbaker

Actually, we already do this for class MesonVersionStringHolder(StringHolder), to override its str.version_compare() method (so that those suppress FeatureNew warnings in conditional blocks).

eli-schwartz avatar Jan 14 '22 00:01 eli-schwartz

for MesonVersionStringHolder that's a small amount of technical debt for a huge payoff. I wonder if it's worth the additional technical debt. Maybe it would be easier to add in to strings?

dcbaker avatar Jan 18 '22 19:01 dcbaker

I think I am going to be targeting Meson 0.62 as the next bump for my project. Would be sweet to get this in for that release assuming it is feasible.

tristan957 avatar Jan 26 '22 08:01 tristan957

There might be another way to do it, which is to name the new option b_sanitizers and return the string get_option('b_sanitize'). Support for -Db_sanitize can be dropped.

bonzini avatar Feb 04 '22 14:02 bonzini

Maybe it would be easier to add in to strings?

I prefer not to have it. Looking for substrings tends to be an antipattern. I'm worried that people would write in instead of the more correct startswith/endswith just because it's more compact.

bonzini avatar Feb 04 '22 14:02 bonzini

I've rebased this and done the 0.61 -> 0.62 and 0.62 -> 0.63 dance. I think there's probably still a few CI failures to sort out.

But, let's talk about the options we have not breaking everything.

  1. continue to do what I did, of returning a sorted, comma joined string for < 0.63, an array otherwise
  2. return an array, and point out that .contains() will work in both cases, accept some old projects will break
  3. implement a special container type that does the coercion of ==, in, etc, and deprecate those comparisons
  4. replace b_sanitize with b_sanitize2 5 some other option I'm not thinking of

dcbaker avatar Jun 06 '22 22:06 dcbaker

This pull request introduces 1 alert when merging cffd7874344f440e3e4e0a52fddc2facd4ecf2ed into 29c26d5b26397eaa3606d22d71309ecd1eb6b223 - view on LGTM.com

new alerts:

  • 1 for Wrong number of arguments in a call

lgtm-com[bot] avatar Jun 06 '22 22:06 lgtm-com[bot]

I only like options 1 and 2. Option 2 is probably my favorite one. I don't consider the other options viable.

tristan957 avatar Jun 06 '22 22:06 tristan957

My favorite is 4/2/1, but I am worried that 2 is not a good idea. For option 4, I would not bother with supporting -Db_sanitize, unless it turns out to be easy.

bonzini avatar Jun 06 '22 22:06 bonzini

My thought for b_sanitize2 (or b_sanitize_array?) is that it would return an array, while b_sanitize would continue to return a string which would always match the values we guarantee, so if you use a new value like "return" then it would just be ignored by b_sanitize, since you wouldn't be able to pass that anyway.

dcbaker avatar Jun 06 '22 23:06 dcbaker

My idea was:

*Name it b_sanitizers which is better anyway than b_sanitize *Do not allow setting b_sanitize anymore on the command line to simplify the implementation *Make get_option('b_sanitize') return the join of the elements of get_option('b_sanitizers')

bonzini avatar Jun 06 '22 23:06 bonzini

yup, that makes sense to me. I'll do that.

dcbaker avatar Jun 06 '22 23:06 dcbaker

Oh by the way, since commit e19e9ce6f196f7c127a2668b5df0ada1d50806df the return value of get_option('b_sanitize') is now a P_OBJ.OptionString (subclass of str). So, we can actually check that now.

eli-schwartz avatar Sep 11 '22 17:09 eli-schwartz

Resubmitted as #12648 and partially merged. Resubmitted again as #14302 and the final bits were merged.

All done.

eli-schwartz avatar Nov 09 '25 03:11 eli-schwartz