rust-bindgen icon indicating copy to clipboard operation
rust-bindgen copied to clipboard

Keep radix for integer literals in generated bindings

Open miikkas opened this issue 6 months ago • 9 comments

The radices of integer literals in the input code, including both C and C++, are retained on a best-effort basis in the generated bindings.

  • The radix of a C/C++ integer literal in constant, enum, or macro definition is parsed. If it can't be determined, the code falls back to not doing anything about the radix.
    • This can be currently optionally enabled with Builder (.keep_integer_radices(true)) or CLI (--keep-integer-radices) switches. For now, the default is to have the feature off.
  • During Rust code generation, the radix of an integer literal – if it has been determined – is then used to output that integer's value represented in that number base.
  • New testing is added in the form of unit tests for parsing the radix of a Clang CXToken_Literal token and for outputting the tokens during codegen, and also as a new test header file paired with its matching expected Rust binding output.
  • Existing tests with hexadecimal or octal integers are updated to keep the corresponding radices of those values.

Resolves #3236.

miikkas avatar Jun 18 '25 17:06 miikkas

I have now included additional testing to the PR.

Regarding the 13 failing existing tests: It seems that the differences are due to bindgen now generating, as intended, Rust literals using hexadecimal representation, while the values themselves have correctly stayed the same. ~Should I fix the expected code accordingly?~

edit: I have adjusted the expected code accordingly.

miikkas avatar Jun 19 '25 17:06 miikkas

The test I added was changed to use C++14 instead of C23, as it seems that the latter is not available for clang 9.0, which IIUC is used in the CI.

I have changed the expected side of the previously failing existing tests. Those changes are currently included in a separate commit in case that makes review less cumbersome. ~I will squash it later as instructed in the contribution guidelines.~

edit: The commits have been squashed.

miikkas avatar Jun 21 '25 18:06 miikkas

I have made the solution a lot more robust and included further unit testing. In my view, this is ready for review.

r? @emilio @pvdrz

miikkas avatar Jul 06 '25 13:07 miikkas

Error: The feature assign is not enabled in this repository. To enable it add its section in the triagebot.toml in the root of the repository.

Please file an issue on GitHub at triagebot if there's a problem with this bot, or reach out on #t-infra on Zulip.

rustbot avatar Jul 06 '25 13:07 rustbot

To mitigate possible churn and not force the feature on all users immediately, I've added an option defaulting to off as per https://github.com/rust-lang/rust-bindgen/issues/3236#issuecomment-3064790027.

miikkas avatar Jul 15 '25 17:07 miikkas

Error: Parsing assign command in comment failed: ...'' | error: specify user to assign to at >| ''...

Please file an issue on GitHub at triagebot if there's a problem with this bot, or reach out on #t-infra on Zulip.

rustbot avatar Aug 08 '25 13:08 rustbot

Error: The feature assign is not enabled in this repository. To enable it add its section in the triagebot.toml in the root of the repository.

Please file an issue on GitHub at triagebot if there's a problem with this bot, or reach out on #t-infra on Zulip.

rustbot avatar Aug 08 '25 13:08 rustbot

The latest change removed the varying case long long suffixes (i.e. lL and Ll) from the INTEGER_SUFFIXES array, since those are not permitted by the C standard nor are they supported by clang or gcc. Additionally, I adjusted the length of some documentation rows that exceeded rustfmt's configured maximum of 80, and I added some testing for shorts and chars as well in the new .hpp test file.

miikkas avatar Aug 17 '25 10:08 miikkas

The commits have now been squashed into one commit that should stand on its own.

miikkas avatar Aug 17 '25 11:08 miikkas