server icon indicating copy to clipboard operation
server copied to clipboard

[MDEV-21978 part 2] Enable and Fix GCC `-Wformat` Checks on `my_vsnprintf` and Users

Open ParadoxV5 opened this issue 1 year ago • 1 comments

  • The Jira issue number for this PR is: MDEV-21978
  • Resolves https://github.com/MariaDB/server/pull/1531#issuecomment-626620929
  • Prerequisite (MDEV-21978 & GSoC part 1): #3309
    • This PR bases on that prerequisite PR and currently shows up inheriting those commits. I will keep this branch updated with that prerequisite branch. When that prerequisite merges, I will rebase this PR to the latest branch directly.
  • Google Summer of Code (GSoC) 2024 milestone 2
    • … And it even assigns me a responsibility I’ve rarely taken before – housekeeping. Everyone dreads chores, amirite?
    • My experience was also regular C, not C++, and I thought I wouldn’t need to deal with C++ for this project. :face_in_clouds:

Why wasn’t this project merged during GSoC?

(Yes, I’m linking this PR as my GSoC final submission.)

Sporadic (inconsistent) CI failures on the base branch have been consistently plaguing the MariaDB/server repo before and during this GSoC event. How do we like them :apple:s? These issues confuse both new (such as me and other GSoC participants) and old contributors, leading to wild geese chases and even alert fatigue.[^m]

[^m]: O. Kekäläinen et al., “Can we get MariaDB GitHub CI to consistently be green?” [Mailing list]. Available: https://lists.mariadb.org/hyperkitty/list/[email protected]/thread/NRMKHZ6JRDEWQ73HKV4XYVKHSEY7X3WF/

MariaDB organizers are aiming to improve coördination and address those bugs promptly. Unfortunately, this shift in focus puts new features such as most of the GSoC projects in the back seat. For more information, folks are tracking those failures at MDEV-33073 always green buildbot and #3425, and are discussing specific issues on the “Test failures” Zulip thread.

After GSoC, I will probably continue updating this PR and the prerequisite “part 1” PR, and also publish more related patches. For the record, this GSoC only owns the content in these two PRs that were created before 2024-08-26T18:00Z (GSoC final submission deadline), including PR descriptions, comments, and commits up to and including “Remove %`s %b %M %T”.

Description

This is the second part of MDEV-21978. It may be the lesser part, but the last 10% takes 90% of the time, they say.

The majority of this PR tags the my_snprintf service functions and all[^a] functions that use those functions (see § Release Notes for notable ones) with __attribute__((format(printf, …))) (via ATTRIBUTE_FORMAT). I committed them as multiple sets – LMK if you prefer my commits squashed.

[^a]: The only omission is static inline void wsrep_override_error(THD*, uint, const char*=, ...), whose default argument is inherently a NULL pointer error to -Wformat.

This attribute effectively enables GCC -Wformat checks on them. It therefore also pushes for (a lot of) format-related fixes to insignificant security vulnerabilities and possible vulnerabilities: (My GSoC mentor told me in Direct Message that they aren’t much of a problem and I can open PR(s) normally.) For some post-GSoC ketchup, I am currently also extracting (read: backporting) these corrections in subsets to older maintained version according to the bug fix process.

  1. 10.5: #3485
  2. 10.6: #3493
  3. 10.11: #3518
  4. 11.2: #3537
  5. 11.4: #3541
  6. 11.5: #3538
  • Passing content directly as the format string
    • I solved most of them with the simple "%s" fix. For push_warning_printf, I switched them to the non-printf equivalent push_warning.
  • Missing or extraneous arguments missed by refactorings
  • Passing our string structs to %s (imlicitly casting to char*)
  • Mixed use of size_t, fixed-size and platform-dependant integer types and size specifiers
    • -Wformat cannot catch these if their sizes happen to line up for the platform. See § non-goals under § More descriptions

The attribute also (conveniently) complained about all format string literals that used the pre-#3309, specifier-based and therefore -Wformat-incompatible extension syntaxes (%`s, %b, %M & %T). To be thorough, I’ve also done a manual grep at the end to identify the loose ends, most of which are error message( template)s from errmsg-utf8.txt, mysys/errors.c and sql-common/errmsg.c. I have updated them all to #3309 (the prerequisite)’s suffix-based, -Wformat-compatible extension syntaxes (%sQ, %sB, %iE & %sT respectively). I’ve updated all impacted tests as well, namely those involving these error message templates.

With all usages upgraded, I’ve finally deleted the old extension syntaxes, marking the completion of the MDEV-21978 transition.

I did not increase service_my_snprintf’s version, as I’ve already done so in the part 1 PR #3309, and I intended these two parts to be merged in the same MariaDB version. I did, though, bump the major versions of other __attribute__d services (logger and my_print_error) to remind that they too now respect -Wformat and switched up the extension syntax.

What problem is the patch trying to solve?

Development tools – namely GCC’s -Wformat – can check printf formats and arguments and catch mistakes. Howëver, my_vsnprintf and co., our in-house platform-agnostic replacement for the printf suite, are incompatible with these tools because of their custom extensions.

Before the prerequisite PR, our format extensions relied on specifiers that’re undefined in the C/C++ Standard and therefore were guaranteed false positives to these error detectors. This problem forced our my_sprintf service to stay away from these helpful features, which contributes to human errors remaining elusive and piling up like technical debt.

After I created those compatible alternatives in the prerequisite, I enabled -Wformat on my_snprintfs and descendants, originally intended to get my GCC builds to automatically locate (most of) those false-positive old constructs for migration (hence beïng the follow-up). Instead, those forgotten mistakes relentlessly haunted me like a freshly-excavated skeleton-in-our-closet.

Since the MariaDB recently shifted focus on squashing long-time bugs, I too decided to not abandon commit quality. I willingly stretched my GSoC project’s scope for this cause by including those dozens of corrections in this Part 2 PR. (As stated in § why GSoC wasn’t merged, we can’t merge GSoC project(s) in time anyhow because of priorities.)

In summary, this beginner-friendly project became much more impactful than I imagined. We not only fixed countless entries of elusive mistakes but are also future-proofing them from reöccuring.

More descriptions

If some output changed that is not visible in a test case, what was it looking like before the change and how it's looking with this patch applied?

Good question. Before fixing those mismatching argument sizes and counts, who knows what out-of-bounds stuff they’ve been reading? There’s also this super insignificant cosmetic fix.

Do you think this patch might introduce side-effects in other parts of the server?

Technically yes, but hopefully not. The old syntaxes have been obsolete and practically deprecated since Part 1, but were once features nonetheless. Although earlier commits should’ve migrated every usage (direct or indirect) in MariaDB/server, it is still theoretically possible that some usages are so stealthy they evaded my grep.

non-goals

  • Go out of my way to catch argument type mistakes that GCC -Wformat doesn’t complain
    • e.g.,
    • I can only lightly review manually on a best-effort basis – I’m not interested in tracking down types through layers of C++ class inheritance.
      • Just the samples I came across already uncovered many instances. Catching them all will be a laborious and intimidating project.
      • I only did an extra meter on sql/table.cc because the main process already heavily modified the file.
  • Tag other snprintf alternatives with ATTRIBUTE_FORMATand fix calls to them or the C/C++ snprintf itself
  • Port every (uncaught) “copy-paste” calls (those that simply clone the format string over and formats no args) to alternatives if available
    • I’ve divided those for my_snprintf to #3429. They will likely conflict with the old patch here; in which case, that PR’s strmakes take precedence.
  • Reformat all those old entire files to match the CODING_STANDARDS.md
    • This includes normalizing tabs to spaces.

Release Notes

checks part 1 notes amend part 1’s notes along the lines of:

  • my_snprintf extensions %`s, %b, %M and %T are removed. They are now %sQ, %sB, %iE and %sT respectively, which are fully compatible with printf checks such as GCC -Wformat.
    • This applies to both my_snprintf and all functions that utilize it, such as:
      service_my_snprintf
        my_snprintf
        my_vsnprintf
      service_logger
        logger_printf
        logger_vprintf
      service_my_print_error
        my_printf_error
        my_printv_error
      DBUG_PRINT
        _db_doprnt_
      push_warning_printf
      
      (Refer to the commit history for the complete listing)
    • On that note, all those functions also gained __attribute__((format(printf, …))) to enable said GCC -Wformat checks, and with that came countless fixes to garbled displays (if not outright crashing) of error messages and debug logs on certain platforms that were stealthed for years!

I searched on https://mariadb.com/kb/en/ and found no relevant pages.

How can this PR be tested?

Most of the changes leverage ATTRIBUTE_FORMAT which enables GCC -Wprintf (we also had -Werror enabled). ATTRIBUTE_FORMAT is currently empty for non-GCC builders.

What ATTRIBUTE_FORMAT cannot capture are the aforementioned error messages. I discovered that extra/comp_err.c checks for consistency in fields between localizations; but beyond this, I don’t know how the MariaDB team reviews applications of those error messages.

PR quality check

  • [x] This is a new feature or a refactoring, and the PR is based against the latest MariaDB development branch.
    • This PR bases on the prerequisite PR which itself is based on the latest development branch.
  • ~~This is a bug fix, and the PR is based against the earliest maintained branch in which the bug can be reproduced.~~
  • [x] I checked the CODING_STANDARDS.md file and my PR conforms to this where appropriate.
  • [x] For any trivial modifications to the PR, I am ok with the reviewer making the changes themselves.

ParadoxV5 avatar Jun 26 '24 03:06 ParadoxV5