cryptopp icon indicating copy to clipboard operation
cryptopp copied to clipboard

Fix ESIGN memcpy null pointer and clang22 fortify warning (#1338)

Open scc-tw opened this issue 2 months ago • 10 comments

  • Fixing the immediate UBSan bug (null pointer to memcpy when the size is 0)
  • Quieting the Clang/Fortify warning at InvertibleESIGNFunction::GenerateRandom(RandomNumberGenerator &rng, const NameValuePairs &param)

Fixes #1338

scc-tw avatar Nov 13 '25 10:11 scc-tw

Thanks for submitting this upstream! I'm glad the fix I provided in #1338 resolved your issue. The .data() approach correctly preserves bounds information for Fortify and Clang's static analyzers while maintaining the same behavior.

Coralesoft avatar Nov 13 '25 10:11 Coralesoft

This PR essentially does not fix the issue, because zero count in memcpy never was a trouble. The problem is ELEM_MAX being larger than clang allows.

irwir avatar Nov 13 '25 16:11 irwir

Hi, thanks for taking the time to look at this.

Just to clarify what I was trying to address: the warning I’m looking at is raised on the memcpy in InvertibleESIGNFunction::GenerateRandom:

std::memcpy(seed + 4, seedParam.begin(), seedParam.size());

The diagnostic shows a bound of 18446744073709551612 (0xFFFFFFFFFFFFFFFC, i.e. -4 as size_t), which suggests the static analysis has lost track of the relationship between the destination buffer and the size being copied once pointer arithmetic on seed + 4 is involved.

In my fork (cryptopp-modern) I changed this to keep the behaviour the same but make that relationship explicit:

seed.resize(seedParam.size() + 4);
CRYPTOPP_ASSERT(seed.size() >= seedParam.size() + 4);
std::memcpy(seed.data() + 4, seedParam.begin(), seedParam.size());

Using seed.data() + 4 instead of seed + 4, together with the CRYPTOPP_ASSERT, gives Fortify/Clang a clearer model of the destination bounds. In my builds (GCC 14.1 as in #1275, and clang22 with fortified headers as in #1338), that’s enough for the false-positive overflow warning on this call site to disappear, without changing runtime behaviour.

I’m not claiming this fixes any broader issues around ELEM_MAX or clang’s global limits – just that it resolves this specific static-analysis warning in GenerateRandom for the configurations above. If you’re still seeing the warning with this pattern, I’m happy to compare configs or dig further.

For reference, the fork and change are here:

  • Repo: https://github.com/Coralesoft/cryptopp-modern
  • Commit: https://github.com/Coralesoft/cryptopp-modern/commit/79ecd1f0

Coralesoft avatar Nov 13 '25 19:11 Coralesoft

@irwir Hello, and thank you for taking the time to review my PR!

First, the PR description itself is a bit short because I tried to follow the contribution guidelines from the official Crypto++ website (https://www.cryptopp.com/ ). For that reason, I also sent an email to the mailing list with the full details.

Here is the complete content:

Issue Summary

  • Primary Issue: UBSan detects null pointer passed to memcpy in esign.cpp:118 when an empty seed (size=0) is provided to InvertibleESIGNFunction::GenerateRandom.
  • Secondary Issue: This fix also resolves Issue #1338 (clang22 fortify warning) by using .data() and an explicit bounds assertion.

Environment

  • Crypto++ Version: Current master (commit 60f81a77)
  • Operating System: Ubuntu (GitHub Actions: ubuntu-latest)
  • Compiler: Clang (default) and Clang 22
  • Build Flags: -fsanitize=address,undefined for sanitizer testing; -O3 -D_FORTIFY_SOURCE=2 for fortify validation

Exact Error Message esign.cpp:118:25: runtime error: null pointer passed as argument 2, which is declared to never be null /usr/include/string.h:44:28: note: nonnull attribute specified here

SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior esign.cpp:118:25

Stack Trace #0 in CryptoPP::InvertibleESIGNFunction::GenerateRandom(...) /home/runner/work/cryptopp/cryptopp/esign.cpp:118:3 #1 in run_one(unsigned long, unsigned int) /home/runner/work/cryptopp/cryptopp/tests/esign_memcpy_asan.cpp:30:9 #2 in main /home/runner/work/cryptopp/cryptopp/tests/esign_memcpy_asan.cpp:47:17

Root Cause When seedParam.size() == 0, seedParam.begin() may return a null pointer, which violates the C++ standard requirement that memcpy must receive non-null pointers even when the copy size is 0.

Minimal Reproduction

A focused test exercising this path is available here: https://github.com/scc-tw/cryptopp/blob/asan-esign-test/tests/esign_memcpy_asan.cpp

The test validates 17 seed lengths (including 0) across 3 modulus sizes for comprehensive edge case coverage.

Evidence

Commit demonstrating the issue:

  • https://github.com/scc-tw/cryptopp/commit/f47507e6beb94c97cdda84f0d0d06f4299cc51d9
  • CI failure: https://github.com/scc-tw/cryptopp/actions/runs/19327409311/job/55281830058

Commit with UBSan fix:

  • https://github.com/scc-tw/cryptopp/commit/db3951aa00258c2789c2eab519b312de6157f948
  • CI success: https://github.com/scc-tw/cryptopp/actions/runs/19327600421/job/55282868767

Complete fix (UBSan + Issue #1338):

  • https://github.com/scc-tw/cryptopp/commit/ab1878dde7658ff8f18482d9944f6996bf3155a6
  • CI validation: https://github.com/scc-tw/cryptopp/actions/runs/19327831662/job/55283667912

Pull Request #1340 I've opened Pull Request #1340 with this fix: https://github.com/weidai11/cryptopp/pull/1340

The fix applies three changes to esign.cpp:117-122:

seed.resize(seedParam.size() + 4);
// Help static analyzer verify bounds (Issue #1338)
CRYPTOPP_ASSERT(seed.size() >= seedParam.size() + 4);
// Guard against null pointer when size is 0 (UBSan)
if (seedParam.size() > 0)
    std::memcpy(seed.data() + 4, seedParam.begin(), seedParam.size());

Key changes:

  1. Use seed.data() instead of seed + 4: Clearer for static analyzers, resolves clang22 fortify warning
  2. Add CRYPTOPP_ASSERT: Explicit bounds verification helps compiler verify safety
  3. Guard with if (size > 0): Prevents null pointer UB, handles empty seed edge case

This approach is inspired by the cryptopp-modern fork fix (commit 79ecd1f0) with additional UBSan protection.


Additional Context (referencing @Coralesoft explanation)

I think @Coralesoft already described the fortify/Clang side of this very clearly in his earlier comment, so I’ll just briefly summarize and connect it to this PR.

In short, the warning we’re seeing is raised on the memcpy in InvertibleESIGNFunction::GenerateRandom:

std::memcpy(seed + 4, seedParam.begin(), seedParam.size());

As @Coralesoft pointed out, the static analysis appears to lose track of the actual buffer bounds once pointer arithmetic (seed + 4) is involved. In his cryptopp-modern fork, he keeps the runtime behavior the same but makes the bounds relationship explicit by:

  • Resizing seed based on seedParam.size() + 4,
  • Adding a CRYPTOPP_ASSERT that encodes the bound,
  • Using seed.data() + 4 instead of seed + 4.

This pattern is exactly what I adopted here for the fortify/Clang diagnostic: it doesn’t change the semantics of GenerateRandom, but it gives the compiler and fortified headers a clearer view of the destination buffer, which is enough to silence the false-positive overflow warning in the configurations we’ve tested (including clang22 with fortified headers).

On top of that, my PR adds the if (seedParam.size() > 0) guard. That part is specifically for UBSan: it avoids calling memcpy with a potentially null pointer when the size is 0, which is what UBSan is currently flagging because of the nonnull attribute on the system memcpy. This does not try to solve any broader ELEM_MAX issues, but it does remove the concrete UBSan runtime error we’re observing, while keeping the logic of GenerateRandom unchanged.

scc-tw avatar Nov 14 '25 06:11 scc-tw

  • Use seed.data() instead of pointer arithmetic for static analyzer

seed.data() is a pointer, hence seed.data()+4 is still pointer arithmetics.

  • Add CRYPTOPP_ASSERT to verify bounds explicitly

There was NDEBUG defined, which is for non-debug builds. Then CRYPTOPP_ASSERT becomes a no-op (void)0. It is unclear how this could impove static analysis, unless it treats whatever is written in asserts as the truth (which would be odd).

  • Guard memcpy against null pointer when seedParam.size() == 0

Seem like this check should have been done much earlier; and maybe exception should have been thrown instead of returning random(?) values without a proper seed.

Possibly the root cause should be fixed somewhere in secblock.h in the definition of the maximum size in SecBlock class, because now it seems to be defined as UNIT64_MAX while malloc (used in resizing) allows only half of the value.

irwir avatar Nov 14 '25 18:11 irwir

@irwir, thanks for the thorough review. You’ve raised valid technical points:

Re: pointer arithmetic, you’re correct that seed.data() + 4 is still pointer arithmetic. The benefit is that using an explicit .data() pointer helps Clang/Fortify track buffer bounds in static analysis, which reduces false positives in practice. The runtime semantics are identical, but the tooling behaviour improves.

Re: CRYPTOPP_ASSERT, agreed it’s a no-op in release builds when NDEBUG is defined (trap.h:101). In practice, the runtime guard in this code path is the check in esign.cpp (around line 114):

if (seedParam.size() > seed.ELEMS_MAX - 4)
    throw InvalidArgument(...);

The assert is there as supplementary debug-time validation.

Re: ELEMS_MAX, this is your strongest point. ELEMS_MAX = SIZE_MAX/sizeof(T) can exceed malloc’s practical limits (often around PTRDIFF_MAX on most systems). While the guard above prevents issues in this particular code path, you’re right that the underlying concern lives in secblock.h rather than in this function.

Proposal, my view is that @scc-tw's PR still has value on its own by:

  • Fixing the immediate UBSan bug (null pointer to memcpy when the size is 0), and
  • Quieting the Clang/Fortify warning at this call site.

I’d suggest we treat this as a targeted fix and track the ELEMS_MAX architecture question separately, if you think it’s worth pursuing, it might make sense to open a follow-up issue for the ELEMS_MAX concern you’ve identified in secblock.h. I’d be happy to contribute to the discussion.

Thoughts?

Coralesoft avatar Nov 14 '25 20:11 Coralesoft

@irwir Thanks a lot for your detailed review comments and for taking the time to write them up.

To be honest, the original wording in this PR description partly came from autocomplete / GPT. The only part I carefully reviewed myself was the email-style explanation I pasted into the PR body. I was trying to follow the project’s contribution guidelines more rigorously and treated the short PR summary as relatively trivial, since the patch itself is small and (in my view) quite safe. I agree with you that the way I phrased it was not ideal and could be misleading, and I appreciate you pointing that out.

Many thanks also to @Coralesoft for the clearer explanation—I’ve updated the wording in this PR to follow that summary, because it matches what I actually had in mind much better.

I also agree that ELEMS_MAX looks like a separate, broader issue. It probably deserves its own GitHub issue so we can discuss and design a proper solution there. Your comments here are very helpful input for that discussion.

Please let me know if the current patch and the discussion so far address your concerns, or if there’s anything else you’d like me to clarify or adjust. I’m happy to provide more details or refine the change further.

scc-tw avatar Nov 15 '25 13:11 scc-tw

@irwir Thank you again for your thoughtful comments.

I agree that the ELEMS_MAX topic and realistic allocation limits belong to a broader design discussion around SecBlock, and are separate from this localized UBSan/Fortify issue. This PR is only meant to address the concrete UBSan runtime error and the Fortify/Clang diagnostic at this specific call site, without changing the behavior of GenerateRandom for non-empty seeds.

To properly address the ELEMS_MAX concern, I will open a separate issue so we can track it and work toward a clean, general solution there. If you have any suggestions for that discussion, I would really appreciate your input.

scc-tw avatar Nov 15 '25 13:11 scc-tw

The benefit is that using an explicit .data() pointer helps Clang/Fortify track buffer bounds in static analysis, which reduces false positives in practice.

Exactly the opposite is true. Abstract pointer replaced direct object access, analyser lost track and could not check the size.

In practice, the runtime guard in this code path is the check in esign.cpp (around line 114):

The guard itself is buggy. On the other hand, PTRDIFF_MAX allocations are unreasonable in 32 bits, even more so for 64 bits. An assert for both zero count and fixed ELEM_MAX might be considered instead of explicit runtime checks.

Thoughts?

Nothing new. The changes should fix diagnosed issues, not just hide. For hiding there are other options - without touching the source code. False positive could be ignored; the same for spurious extra warnings generated due to additional compiler flags, like /Wall in Visual Studio.

irwir avatar Nov 17 '25 10:11 irwir

@noloader Hi, could you take a look at this when you have a moment?

This PR fixes:

  • a UBSan null-pointer memcpy in InvertibleESIGNFunction::GenerateRandom for the empty-seed case, and
  • the clang22 Fortify warning tracked in #1338,

with the .data() + CRYPTOPP_ASSERT pattern already used in cryptopp-modern.

There’s been some discussion above about whether this should stay a narrow, call-site fix vs. addressing broader ELEM_MAX/SecBlock limits. I’d really appreciate a maintainer decision on whether a scoped fix like this is acceptable as-is, or if you’d prefer it reworked / moved to a more fundamental change. Happy to adjust the patch either way.

scc-tw avatar Nov 17 '25 18:11 scc-tw