Codecov seems to report false positives
During recent PR reviews I observed two Codecov issues related to macros:
CELER_DISCARD and CELER_NOT_IMPLEMENTED: SolidConverter.cc lines 815 and 816 (from https://github.com/celeritas-project/celeritas/pull/2032)
EXPECT_EQ: IntersectRegion.test.cc line 2527 (from https://github.com/celeritas-project/celeritas/pull/2032)
As seen/hinted here: https://app.codecov.io/gh/celeritas-project/celeritas/pull/2043?dropdown=coverage&src=pr&el=continue&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=celeritas-project
It seems that the issue might be triggered by Real3{0, 0, 0}, since the line is counted correctly for Real3{0.1, 0.1, 0.1}.
Weird, so maybe it's some sort of compiler optimization?
Maybe whatever CI job runs Codecov could be run without compiler optimizations. That would eliminate having to decorate the code with pragmas for loops that may be unrolled, for example.
Here's another false positive in #2048: https://app.codecov.io/gh/celeritas-project/celeritas/pull/2048
I know every loop and conditional here is being tested in PolySolid.test.cc. And what's up with the "missing" line of coverage?
@pcanal Is there a tolerance parameter that lets us assume codecov will have a few false positives per PR so that it doesn't get a nasty ❌ marker?
Is there a tolerance parameter that lets us assume codecov will have a few false positives per PR so that it doesn't get a nasty ❌ marker?
Hide it from 'which' part of the interface? On specific lines, I suppose the comment is the right direction. On the codecov web site, a percentage wouldn't help since it has the marker next to each line ...
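For reference, that per-line comment approach would look roughly like the sketch below, assuming the coverage data come from gcov and are post-processed with lcov or gcovr (which recognize LCOV_EXCL_LINE / LCOV_EXCL_START / LCOV_EXCL_STOP comments); Codecov just displays whatever the uploaded report contains, so any exclusion has to happen at that stage:
// Single-line exclusion: only the physical line carrying the comment is skipped
EXPECT_EQ(SignedSense::inside,
          this->calc_sense(result.node_id, Real3{0, 0, 0.1}));  // LCOV_EXCL_LINE

// Block exclusion for a statement that wraps across source lines
// LCOV_EXCL_START
EXPECT_EQ(SignedSense::inside,
          this->calc_sense(result.node_id, Real3{0, 0, 0.1}));
// LCOV_EXCL_STOP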
We definitely do not have the time and overhead to wait for code coverage to give us back a bunch of false positives and then add ugly markup to the code to hide them… that could be reasonable if it were one in every thousand lines, but it seems to be one in every 50.
Explanation of: EXPECT_EQ: IntersectRegion.test.cc line 2527 (from https://github.com/celeritas-project/celeritas/pull/2032)
The report is 'technically' sort of correct (albeit not that relevant). We have some uses of EXPECT_EQ on a single line:
EXPECT_EQ(SignedSense::inside, this->calc_sense(result.node_id, Real3{0, 0, 0}));
and sometimes split across multiple lines:
EXPECT_EQ(SignedSense::inside,
this->calc_sense(result.node_id, Real3{0, 0, 0.1}));
which expands to:
# 2538 "/home/pcanal/geant/sources/celeritas/test/orange/orangeinp/IntersectRegion.test.cc" 3 4
...
# 2538 "/home/pcanal/geant/sources/celeritas/test/orange/orangeinp/IntersectRegion.test.cc" 3 4
))) ; else ::testing::internal::AssertHelper(::testing::TestPartResult::kNonFatalFailure, "/home/pcanal/geant/sources/celeritas/test/orange/orangeinp/IntersectRegion.test.cc", 2538, gtest_ar.failure_message()) = ::testing::Message()
# 2539 "/home/pcanal/geant/sources/celeritas/test/orange/orangeinp/IntersectRegion.test.cc"
;
where the else clause is indeed not exercised ... and the macro being split across multiple lines makes the result different and (possibly because the expansion shuffles things around) makes the reporting 'murky'.
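Stripped of the gtest internals, the shape of what the macro generates is roughly the following (a boiled-down, hypothetical sketch with stand-in names, not the literal expansion above):
#include <iostream>

// Hypothetical stand-ins for the gtest comparison and failure-reporting machinery
bool comparison_succeeded() { return true; }
void record_failure() { std::cerr << "expectation failed\n"; }

void check_sense()
{
    if (comparison_succeeded())
        ;                   // taken every time in a passing test
    else
        record_failure();   // never executed, so gcov reports the else branch as missed
}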
I am investigating ways to mitigate this.
Aha, very sneaky! Good diagnosis.
Are there flags that can be added to silence all exception code at once, e.g., https://app.codecov.io/gh/celeritas-project/celeritas/blob/develop/src%2Fcorecel%2Fdata%2Fdetail%2FPinnedAllocatorImpl.cc ; or a way to have "don't test this line" flags propagate through macros such as CELER_NOT_IMPLEMENTED; or do we just have to feed the "untested lines" result into an AI bot that adds the appropriate markers and submits a PR?
@pcanal Here's what I really want to avoid:
This PR (#2051) reports itself as having failed. Even though codecov is not one of the "required" checks, the failure obscures whether the PR is ready for review and whether it is valid after merging, because GitHub doesn't use different markers for "actual failure" versus "allowed failure". That makes it harder to tell at a glance which PRs are ready for review.
We can't afford to go through every PR after codecov has run and add "GCOV" markers to EXPECT_EQ conditions that happen to wrap to two lines. I can't find any documentation that would let us auto-add markers or annotate from inside a macro rather than where the macro is used. Ideally we wouldn't have to drop coverage of test/ because it's good to find dead code there.
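For what it's worth, the reason the marker can't live inside the macro definition (at least with lcov/gcovr) is that the exclusion comments are matched against the physical lines of each source file, so a marker in the header only excludes the line of the definition, not the line where the macro is expanded. A hypothetical sketch, not the real macro or converter code:
#include <stdexcept>

// Hypothetical definition; the real CELER_NOT_IMPLEMENTED differs
#define CELER_NOT_IMPLEMENTED(what) \
    throw std::logic_error(what)    // LCOV_EXCL_LINE: excludes only this header line

void convert_solid()  // hypothetical caller
{
    // The marker does not travel with the expansion, so the coverage report
    // still flags this call site:
    CELER_NOT_IMPLEMENTED("not implemented");
}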
Thoughts?
We can't afford to go through every PR after codecov has run and add "GCOV" markers to EXPECT_EQ conditions that happen to wrap to two lines.
Agreed.
Ideally we wouldn't have to drop coverage of test/ because it's good to find dead code there.
The code in test is structurally not fully tested since it is basically:
auto res = run_something();
if (res != expected_state)
    handle_failure();
So until/unless there is a way to easily/automatically annotate this pattern (e.g. we cannot annotate EXPECT_EQ per se since it is inside a 3rd-party library), I think we can skip the reporting for the test (and possibly app) directories for now.
Tagging #2093