codeql icon indicating copy to clipboard operation
codeql copied to clipboard

C++: Fix FPs for cpp/unused-static-function in files that were not extracted completely

Open geoffw0 opened this issue 3 years ago • 9 comments
trafficstars

Fix FPs for cpp/unused-static-function in files that were not extracted completely, e.g. due to a compilation error part way through the file. As the test shows, this may hide some good results as well, but we expect the majority of results in incompletely extracted files to be false positives caused by the incomplete extraction.

geoffw0 avatar Sep 21 '22 09:09 geoffw0

I'm just looking at a possible performance issue on this PR...

geoffw0 avatar Sep 21 '22 10:09 geoffw0

Do we want to run a DCA on this once you've fixed the performance issue?

I'm actually fairly confident, the issue was an uncomplicated cartesian product. I will start a DCA run to be sure.

geoffw0 avatar Sep 21 '22 10:09 geoffw0

Looks like this change removes > 200 results on DCA. I think we should check a subset of those to see if this looks reasonable. What do you say, @geoffw0?

MathiasVP avatar Sep 22 '22 12:09 MathiasVP

Yep, I'm not surprised if it removes a lot of results (and some of them will be TPs), but we should check at least a proportion of them are really FP.

geoffw0 avatar Sep 22 '22 12:09 geoffw0

Do you know where to find the total number of results for cpp/unused-static-function in the DCA run? Losing 236 is a lot, but I have a feeling its a noisy query in the first place.

geoffw0 avatar Sep 22 '22 14:09 geoffw0

Do you know where to find the total number of results for cpp/unused-static-function in the DCA run? Losing 236 is a lot, but I have a feeling its a noisy query in the first place.

I don't think this is possible out-of-the-box, no. There might be an option somewhere to tell it to generate the full list of alerts. We could ask in the DCA Slack channel about this if you think it'd be useful.

MathiasVP avatar Sep 22 '22 14:09 MathiasVP

I think it would be useful in cases like this, yes, but I'm tracking a million things today already. Feel free to talk to them.

geoffw0 avatar Sep 22 '22 15:09 geoffw0

DCA:

I'm ignoring the 5% overall increase in query run times, because this is a change to one QL file and it isn't very plausible its had an effect on so many other queries.

Looking at some of the results:

Kitware__CMake: Source/cmCPluginAPI.cxx:25:20:25:35 (represents 54 similar results)

  • a pointer to the function cmGetClientData is stored in an array cmStaticCAPI, but that array does not appear to be used anywhere. I believe this is probably a TP for the query, so an unfortunate loss for this change.
  • there are two ErrorExprs in the file but I believe (based on Locations) the entire file was extracted.

abseil-cpp-linux: absl/base/exception_safety_testing_test.cc:526:6:526:13

  • DummyOp is only referenced inside a decltype expression; if I'm understanding correctly that means the type of the function is used but the body is not, which is probably a TP for the query?
  • ErrorExprs thoughout the file, but I believe the entire file was extracted.

systemd__systemd: src/analyze/analyze-security.c:148:22:148:40

  • security_info_free is a macro arg to DEFINE_TRIVIAL_CLEANUP_FUNC which creates a function calling security_info_free, but it isn't clear whether that function is called anywhere.
  • one ErrorExpr and one Diagnostic, but I believe the entire file was extracted.

Overall, many of these cases are in tests (so we wouldn't normally report them) and in macro or template heavy code (so difficult to understand manually).

On MRVA I see the total number of results in 69 projects decrease from 779 to 416 (47% reduction), which seems fine for such a noisy query. Most cases we lose had both ErrorExprs and Diagnostics in the same file (unlike 2 of the 3 cases above), suggesting that something had gone more seriously wrong, though its still difficult to diagnose individual cases with any confidence. I get most of my confidence from the QL tests.

We could remove the part of the change that looks at ErrorExprs, to focus the change to the more severe cases (where we're likely to have successfully extracted less of the code)???

geoffw0 avatar Sep 22 '22 16:09 geoffw0

I've just done the change to just basing this on Diagnostics, not ErrorExprs.

Tests still pass. There are now 717 MRVA results on 69 repos (8% less than original). Does anyone want to see another DCA run?

geoffw0 avatar Sep 23 '22 14:09 geoffw0

DCA showed:

  • 107 fixed alerts for cpp/unused-static-function in systemd__systemd, 32 in zeek__spicy and 1 in microsoft__ChakraCore.
  • the 54 (dubious) changes to results on Kitware__CMake from the previous DCA run are gone; as are the 4 from boostorg__boost, 3 from openjdk__jdk and 3 from openttd__openttd.
  • sampling some of the fixed alerts (one from each affected project), I am pretty happy with the changes (though its difficult to see what is and isn't used by hand).
  • so I think the changes have reduced the number of "fixed alerts" greatly, but improved the ratio of them that I would deem to be "probably correct".

geoffw0 avatar Oct 17 '22 17:10 geoffw0

Merged. Thanks for reviewing!

geoffw0 avatar Oct 18 '22 13:10 geoffw0