7903933: Move sharable items from different generations to a common file
Please review this patch to move the C_* layouts and the static utility methods into separate classes: LayoutUtils.java and FFMUtils.java, respectively.
- The names could later be personalized through a JSON configuration.
- We cannot use static imports if the -t option is not used and the files are generated into the default package; in that case we use the class name to call the static methods or use the C_* constants.
Some tests had to be modified slightly, either by adding new static imports or replacing classnames.
Progress
- [x] Change must not contain extraneous whitespace
- [x] Change must be properly reviewed (no review required)
Issue
- CODETOOLS-7903933: Move sharable items from different generations to a common file (Enhancement - P4)
Reviewing
Using git
Checkout this PR locally:
$ git fetch https://git.openjdk.org/jextract.git pull/278/head:pull/278
$ git checkout pull/278
Update a local copy of the PR:
$ git checkout pull/278
$ git pull https://git.openjdk.org/jextract.git pull/278/head
Using Skara CLI tools
Checkout this PR locally:
$ git pr checkout 278
View PR using the GUI difftool:
$ git pr show -t 278
Using diff file
Download this PR as a diff file:
https://git.openjdk.org/jextract/pull/278.diff
:wave: Welcome back nbenalla! A progress list of the required criteria for merging this PR into master will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.
@nizarbenalla This change now passes all automated pre-integration checks.
After integration, the commit message for the final commit will be:
7903933: Move sharable items from different generations to a common file
Reviewed-by: mcimadamore, jvernee
You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed.
At the time when this comment was updated there had been 1 new commit pushed to the master branch:
- ec585cccaa131f429f77df3620325fe465c69ee0: 7903877: jextract exception handling in downcall wrappers
Please see this link for an up-to-date comparison between the source branch of this pull request and the master branch.
As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details.
As you do not have Committer status in this project an existing Committer must agree to sponsor your change. Possible candidates are the reviewers of this PR (@mcimadamore, @JornVernee) but any other Committer may sponsor as well.
➡️ To flag this PR as ready for integration with the above commit message, type /integrate in a new comment. (Afterwards, your sponsor types /sponsor in a new comment to perform the integration).
Webrevs
- 09: Full - Incremental (5733943f)
- 08: Full - Incremental (167423e1)
- 07: Full - Incremental (e3b9ae2a)
- 06: Full - Incremental (5c8802a6)
- 05: Full - Incremental (532ee1a6)
- 04: Full (01c8205f)
- 03: Full - Incremental (69d46473)
- 02: Full - Incremental (736ccfba)
- 01: Full - Incremental (fa41fe8b)
- 00: Full (583bcf61)
@nizarbenalla This pull request has been inactive for more than 4 weeks and will be automatically closed if another 4 weeks passes without any activity. To avoid this, simply add a new comment to the pull request. Feel free to ask for assistance if you need help with progressing this pull request towards integration!
I wasn't happy with the previous approach/implementation and started over from scratch. I also did some small cleanup (the newly added option for frameworks should use -- rather than -).
Shared items from multiple generations will only be moved to a class if the user specifies the --sharable-items option with a class name of their choosing; the main header class then extends that class. This will be helpful for projects that need to run jextract on multiple headers, especially if those headers are large.
Adding an option to move shared items into a separate class feels like taking a shortcut. Let's look at what's inside this bag of shared items (it used to be much bigger, so good news there):
1. library arena
2. trace downcall support
3. alignment method
4. findOrThrow
5. upcallHandle
6. primitive layout constants
I believe (1) should probably not belong in a separate class from the headers. The way I see it, it should live wherever the library lookup lives. Even if you run multiple extractions, it is not clear to me whether there should be a single lifetime for all the different libraries or not (and, eventually, we want aspects such as this to be customizable via subclassing).
(2) seems a very useful debug option, and I think we should think about making it more official -- for instance with a linker option.
(3) is a general purpose method that would make sense to feature in the main MemoryLayout API.
(4) seems to be superseded by a similar method that was added in Java 23: https://docs.oracle.com/en/java/javase/23/docs//api/java.base/java/lang/foreign/SymbolLookup.html#findOrThrow(java.lang.String)
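For context, here is a minimal sketch of the kind of helper (4) refers to, written against the public SymbolLookup API. The class name FindOrThrowSketch, the use of the default lookup, and the choice of NoSuchElementException (mirroring the Java 23 method) are assumptions for illustration; this is not the code jextract generates. It assumes JDK 22 or later.

```java
import java.lang.foreign.Linker;
import java.lang.foreign.MemorySegment;
import java.lang.foreign.SymbolLookup;
import java.util.NoSuchElementException;

public class FindOrThrowSketch {
    // Hand-written helper (hypothetical): resolve a symbol or fail loudly.
    // On JDK 23+, lookup.findOrThrow(name) makes this method unnecessary.
    static MemorySegment findOrThrow(SymbolLookup lookup, String name) {
        return lookup.find(name)
                .orElseThrow(() -> new NoSuchElementException("Symbol not found: " + name));
    }

    public static void main(String[] args) {
        // The default lookup exposes commonly used C library symbols.
        SymbolLookup stdlib = Linker.nativeLinker().defaultLookup();

        MemorySegment strlen = findOrThrow(stdlib, "strlen");
        System.out.println("strlen address is non-zero: " + (strlen.address() != 0));

        try {
            // Deliberately bogus name to exercise the failure path.
            findOrThrow(stdlib, "no_such_symbol_for_this_sketch");
        } catch (NoSuchElementException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

Once the generated code targets Java 23, each call site can use the API method directly and the helper drops out of the shared set.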
(5) seems like it could be inlined in the functional interface classes (it's just a helper to allow clients to create an upcall stub w/o the need to do a MethodHandle lookup -- which needs a try/catch).
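To make the trade-off in (5) concrete, here is a rough sketch contrasting a shared lookup helper with the same lookup inlined next to the functional interface. The interface Compare and all names are hypothetical, and the sketch uses a plain MethodType instead of a FunctionDescriptor so that it only depends on java.lang.invoke; it is not the generated code.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

public class UpcallHandleSketch {
    // Hypothetical stand-in for a generated functional interface.
    interface Compare {
        int apply(int a, int b);
    }

    static final MethodType COMPARE_TYPE =
            MethodType.methodType(int.class, int.class, int.class);

    // Shared-helper style: one method hides the checked lookup exception
    // for every functional interface.
    static MethodHandle upcallHandle(Class<?> fi, String name, MethodType type) {
        try {
            return MethodHandles.lookup().findVirtual(fi, name, type);
        } catch (ReflectiveOperationException ex) {
            throw new AssertionError(ex);
        }
    }

    // Inlined style: the lookup sits with the interface it belongs to,
    // so nothing needs to be shared across classes.
    static final MethodHandle COMPARE_APPLY;
    static {
        try {
            COMPARE_APPLY = MethodHandles.lookup()
                    .findVirtual(Compare.class, "apply", COMPARE_TYPE);
        } catch (ReflectiveOperationException ex) {
            throw new ExceptionInInitializerError(ex);
        }
    }

    // Small wrapper so callers do not have to handle Throwable.
    static int applyVia(MethodHandle handle, Compare target, int a, int b) {
        try {
            return (int) handle.invoke(target, a, b);
        } catch (Throwable t) {
            throw new RuntimeException(t);
        }
    }

    public static void main(String[] args) {
        Compare byDifference = (a, b) -> a - b;
        MethodHandle viaHelper = upcallHandle(Compare.class, "apply", COMPARE_TYPE);
        System.out.println(applyVia(viaHelper, byDifference, 7, 3));
        System.out.println(applyVia(COMPARE_APPLY, byDifference, 7, 3));
    }
}
```

Both handles behave the same; the inlined form costs a few repeated lines per interface but removes one entry from the shared class.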
Then there's (6). The first reaction I got was: well, whether primitive layouts should be emitted or not seems like another filtering decision (e.g. let's add some options to filter these out). Except this would not work -- it's not just about filtering -- it's also about telling every other file where to find the layouts for such primitive types. If they are defined somewhere else, then jextract needs to know where to find them (e.g. if they are referenced by some other layout jextract is building). Which seems a similar problem as the one this PR is trying to solve anyway.
Stepping back -- I think if we look back, jextract used to generate a RuntimeHelper class, and then everything else. While we moved away from that generation scheme (now we just emit header classes), the shared functionalities are still there. Perhaps a move in a good direction would be to:
- put all the shared functionalities in the root of the header hierarchy -- and don't put anything else in there
- this means that there will always be at least two header classes generated, foo_h and foo_h$0, where foo_h extends foo_h$0 and all the shared symbols are in foo_h$0
- add an option to override the name of that root header class
This is similar, in spirit, to what you have here, but with the advantage that there's only one generation scheme, not two -- e.g. the superclass with the shared symbol is always there -- only sometimes it can have a different name (because the user said so). Whether we want that default superclass name to be foo_h$0, or maybe something more explicit like foo_h_shared, I'm open to suggestions.
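A minimal sketch of that single generation scheme, assuming JDK 22 or later. The root class name CommonSymbols stands for a user-chosen override, and bar_h, FOO_MAX and BAR_MAX are hypothetical; none of this is actual jextract output.

```java
import java.lang.foreign.ValueLayout;

public class SharedRootSketch {
    public static void main(String[] args) {
        // Static members of the root are reachable through every header class,
        // and both names refer to the very same constant.
        System.out.println(foo_h.C_INT == bar_h.C_INT);
        System.out.println(foo_h.C_INT.byteSize() + " " + foo_h.FOO_MAX + " " + bar_h.BAR_MAX);
    }
}

// Root of the header hierarchy: shared symbols only, nothing header-specific.
class CommonSymbols {
    CommonSymbols() { }

    public static final ValueLayout.OfInt C_INT = ValueLayout.JAVA_INT;
    public static final ValueLayout.OfDouble C_DOUBLE = ValueLayout.JAVA_DOUBLE;
}

// Header class from a first extraction (hypothetical content).
class foo_h extends CommonSymbols {
    public static final int FOO_MAX = 16;
}

// Header class from a second extraction that names the same root,
// so the shared symbols are not emitted twice.
class bar_h extends CommonSymbols {
    public static final int BAR_MAX = 32;
}
```

Client code keeps writing foo_h.C_INT regardless of where the constant is declared, which is why the root class can be renamed without changing how the bindings are used.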
But I do think that we should try and make that shared class as small as possible -- most of the functionality there seems like it could belong in the main FFM API, and it would provide value even for clients not running on top of jextract.
Note/history: while tempting, we can't really put (6) inside the main FFM API. We have thought about this for a long time -- the issue is that the type of each primitive layout is not guaranteed to be stable. E.g. C_LONG might be either a ValueLayout.OfLong or a ValueLayout.OfInt, depending on the platform. Other layouts, such as C_LONG_DOUBLE, might only be available on some platforms and not others. This is why, long ago, we decided that the main FFM API should not concern itself with providing layout constants for C types -- as the set of such constants is not stable. Instead, such primitive C layouts can be "discovered" using the Linker::canonicalLayouts API.
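As an illustration of that discovery step, the sketch below queries Linker::canonicalLayouts for the layout of C long on the current platform. The class and helper names are hypothetical, and it assumes JDK 22 or later.

```java
import java.lang.foreign.Linker;
import java.lang.foreign.MemoryLayout;
import java.lang.foreign.ValueLayout;

public class CanonicalLayoutsSketch {
    // Look up the layout the native linker associates with a C type name.
    static MemoryLayout cLayout(String typeName) {
        MemoryLayout layout = Linker.nativeLinker().canonicalLayouts().get(typeName);
        if (layout == null) {
            // Some types (long double, for instance) may be absent on a given platform.
            throw new IllegalArgumentException("No canonical layout for: " + typeName);
        }
        return layout;
    }

    public static void main(String[] args) {
        MemoryLayout cLong = cLayout("long");
        // The Java carrier differs by platform, which is why a fixed C_LONG
        // constant cannot live in the main API.
        System.out.println("long is " + cLong.byteSize() + " bytes");
        System.out.println("OfLong: " + (cLong instanceof ValueLayout.OfLong));
        System.out.println("OfInt:  " + (cLong instanceof ValueLayout.OfInt));
    }
}
```

The output depends on the platform the code runs on, which is exactly the instability described above.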
Also on this topic: for now we're mostly concerned about different extractions not repeating the code for primitive layouts and helper functions. This feels more like a "tip of the iceberg" kind of situation. For instance, you might have two libraries A and B, which both include the header of some third library C. Maybe you want to extract C separately, and then extract A and B so that they somehow magically point at the extracted bindings for C. Now sharing would be not just about primitive types, but about functions, structs and much more.
At the same time, going down this path can be very complex: A and B might pull in slightly different versions of C, or use some #define macro directives which would alter the shape of the generated bindings in C. In which case reusing the same bindings for C would be more difficult.
So, hidden somewhere in here there's a theme of: how do we move jextract from a per-extraction set of bindings to a multi-extraction friendly model. And going down this path will likely, I think, result in opening a big and complicated can of worms. I'm not saying we'll never get there -- but there's a limit to what we can express with simple command line options.
this means that there will always be at least two header classes generated, foo_h and foo_h$0, where foo_h extends foo_h$0 and all the shared symbols are in foo_h$0.
Should we just give the base header a common name then, so that if you generate multiple times, you get sharing automatically? Maybe it could be named something like Builtins.
Yes, see
This is similar, in spirit, to what you have here, but with the advantage that there's only one generation scheme, not two -- e.g. the superclass with the shared symbol is always there -- only sometimes it can have a different name (because the user said so). Whether we want that default superclass name to be foo_h$0, or maybe something more explicit like foo_h_shared, I'm open to suggestions.
(4) findOrThrow seems to be superseded by a similar method that was added in Java 23: https://docs.oracle.com/en/java/javase/23/docs//api/java.base/java/lang/foreign/SymbolLookup.html#findOrThrow(java.lang.String)
I meant to remove this when targeting JDK 23; it seems I left it in by mistake. I will remove it to reduce the number of shared items we're dealing with.
I've moved the shared symbols to a different class foo_h$shared that all headers extend from.
The new command line option was renamed to --shared-symbols to allow users to specify the name of the class if they want to.
The option for changing header class name is called --header-class-name <name>. So I'd suggest something like --symbols-class-name.
I pushed a small test-only update; I was seeing some failures in the CI (that I didn't see locally). This fixes it.
Thanks for all the rounds of reviews.
/integrate
@nizarbenalla Your change (at version e3b9ae2a21517e008a06e3e55d42a5957366d4ad) is now ready to be sponsored by a Committer.
Thanks for the updates. Looks good now!
Thanks! Glad this work can be integrated into jextract.
/integrate
@nizarbenalla Your change (at version 5733943f65702d599da7cbbc1699b38740a531c9) is now ready to be sponsored by a Committer.
/sponsor
Going to push as commit ab6b30fd189e33a52d366846202f2e9b9b280142.
Since your change was applied there has been 1 commit pushed to the master branch:
- ec585cccaa131f429f77df3620325fe465c69ee0: 7903877: jextract exception handling in downcall wrappers
Your commit was automatically rebased without conflicts.
@JornVernee @nizarbenalla Pushed as commit ab6b30fd189e33a52d366846202f2e9b9b280142.
:bulb: You may see a message that your pull request was closed with unmerged commits. This can be safely ignored.