server
server copied to clipboard
MDEV-21978 make my_vsnprintf to use gcc-compatible format extensions
- The Jira issue number for this PR is: MDEV-21978
- Google Summer of Code (GSoC) 2024 milestone 1
- Despite I’m not a beginner at open-source coding, this GSoC still manages to stretch my comfort zone :chart_with_upwards_trend:. Workflow, code style, and even the organization size for starters are all new to me.
- GSoC milestone 2 & project status: #3360
- Achievement: Discovered that emojis break Buildbot. :sweat_smile:
Description
This is the first but primary part of MDEV-21978.
It adds the following suffix-based (MariaDB) alternatives to my_vsnprintf’s MySQL extensions:
| Old | New | Brief Description |
|---|---|---|
%`s |
%sQ[^I] |
quote with ` |
%b |
%sB |
fill with the binary sequence without terminating on \0 |
%T |
%sT |
use ... for any truncation |
%sS |
opt-in escapes for any conflicts with the above | |
%M |
%iE[^u] |
print this errno and its corresponding strerror() |
(%dE has no effect and thus serves as %iE’s escape.) |
[^I]: Zulip discussion agreed to change this from JIRA’s %sI
[^u]: Zulip discussion suggested to change this from JIRA’s %uE.
%d and %i were synonyms in plain C/C++ printf. This PR’s design utilizes that (but breaks it) to skip %iI and in turn %dE. This design reduces the impact of the suffix design (see § side-effects) by leaving the more popular %d alone. %i solos %iE here and, if we make more extensions in the future, will solo those as well.
OTOH, no other format specifiers have a less-popular synonym, so %s (plus any specifiers extended in the future) gets an anti-conflict escape %sS[^X].
[^X]: JIRA author @vuvova designed this from SQL select '''': https://mariadb.zulipchat.com/#narrow/stream/118759-general/topic/MDEV-21978.20make.20my_vsnprintf.20to.20use.20gcc-compatible.20format/near/438173635
Note: Unlike what the JIRA requires:
All error messages and all usages of
my_vsnprintfshould be changed to use the new syntax. One way to do it is to disable old syntax conditonally[sic], only in debug builds. All gccprintfformat checks should be enabled.
This PR will only migrate usages immediately affected by the addition of suffixes. To divide the project for reviewers’ convenience (and to mark Google Summer of Code milestones), I have divided the following housekeeping to a separate follow-up PR at #3360 (MDEV-21978 & GSoC part 2):
- (Re)activate GCC
printfchecks - Utilize those GCC checks to port
my_vsnprintfusages to the new preferred syntax - Delete the old, practically deprecated extension syntaxes (e.g.,
%M) becauseprintfchecks are sure to compilain their tests.- To help keep the above housekeeping chores a sequential process, I’ve left the old format intact for now, although only on a best-effort basis.
What problem is the patch trying to solve?
Development tools such as GCC can check printf formats and arguments and catch mistakes.
Howëver, our format extensions inherited from MySQL are incompatible with these tools – as listed above, they utilized specifiers that’re undefined in the C Standard.
Porting them to Standard C vsnprintf would lose the platform consistency of our in-house substitute.
Therefore, the middle ground is to instead (eventually) switch to a standard-compatible set for those extensions. The JIRA says that the Linux kernel has already gone this route.
Do you think this patch might introduce side-effects in other parts of the server?
Potentially.
The introduction of suffixes makes %s and %i start consuming certain successive chars (one of QBTSE) which may’ve been originally intended for literal printing. As such, in addition to those extension alternatives, the design provides %sS and %d as synonyms for %s and %i for escaping.
For example, Data Class: %sType could now result in Data Class: Structype. It now needs an escape: Data Class: %sSType (or better, a space).
In recognition of this (potential) uh regression?, I’ve bumped the snprintf service’s major version.
The CI tests apparently found no mistakes, though. If there’re affected usages, I will iron them out during the rest of the GSoC while I track printf usages to migrate to the new syntax.
non-goal
my_vsnprintf.c’s code seems to have some age. Howëver, intensive refactoring for optimization is outside of this ticket’s scope.
Non-exhaustive list of possible improvements (in no particular order)
- There may be some repeated logic in
my_vsnprintf_exandprocess_argwe could extract to additional helper functions. unionconstructs reüse the same RAM space for any argument type.- Input-tracking variables could be in the loops’ scope rather than (file-)global.
- We could repurpose
process_str_arg’slength_argrather than simply copying it tolength. process_str_arg’sif (nice_cut)has some mergeable code.- Besides the one in this PR, there are more
if-elsechains inprocess_int_argandmy_vsnprintf_exthat could instead useswitch-case. process_str_arg’sif (left_fill)currently doesn’t cover%sQand doesn’t have anelsesection.0is a flag, not a part of the field length… it shouldn’t beget_length’s concern…process_argcould do some preprocessing to minimize duplicated code.
Release Notes
checks existing notes along the lines of:
my_snprintf_serviceextensions%`s,%b,%Mand%Tare now%sQ,%sB,%iEand%sTrespectively, which are fully compatible withprintfchecks such as GCC-Wformat.- There is also a new escape
%sSsynonymous to%s. %dEisn’t added; use it to escape%iE.- This applies to both
my_snprintfand all other service functions that utilize it.
- There is also a new escape
The snprintf family is an internal component, and at most exported to plugins.
FWIW, this PR alone should bring no user-visible changes, and code quality brags can wait ’til Part 2.
I have updated the (only?) documentation in service_my_snprintf.h. I have searched printf on https://mariadb.com/kb/en/ and found no relevant pages.
How can this PR be tested?
We can test the new syntax with the same strategy as the old.
In fact, the new tests I’ve included in this PR for the new syntax, I crafted them by modifying a copy-paste of existing my_vsnprintf-t.c lines.
Note that these tests are no more comprehensive as the existing originals (which, uh, weren’t very throughout, unfortunately).
I’ve additionally tailored a semi-unique test to confirm that %sB ignores \0-termination.
PR quality check
- [x] This is a new feature or a refactoring, and the PR is based against the latest MariaDB development branch.
- In order to fit this new feature, I had to also do a wee bit of refactoring so the implementation fit inside existing
%sand%ibranches.
- In order to fit this new feature, I had to also do a wee bit of refactoring so the implementation fit inside existing
- ~~This is a bug fix, and the PR is based against the earliest maintained branch in which the bug can be reproduced.~~
- [x] I checked the CODING_STANDARDS.md file and my PR conforms to this where appropriate.
- [x] For any trivial modifications to the PR, I am ok with the reviewer making the changes themselves.