7903933: Move sharable items from different generations to a common file

Open nizarbenalla opened this issue 10 months ago • 11 comments

Please review this patch to move the C_* layouts and the static utility methods into separate classes: LayoutUtils.java and FFMUtils.java, respectively.

  • The names could later be personalized through a JSON configuration.
  • We can use static imports; if the -t option is not used and the files are generated into the default package, we instead use the class name to call the static methods or refer to the C_* constants.

Some tests had to be modified slightly, either by adding new static imports or replacing classnames.


Progress

  • [x] Change must not contain extraneous whitespace
  • [x] Change must be properly reviewed (no review required)

Issue

  • CODETOOLS-7903933: Move sharable items from different generations to a common file (Enhancement - P4)

Reviewing

Using git

Checkout this PR locally:
$ git fetch https://git.openjdk.org/jextract.git pull/278/head:pull/278
$ git checkout pull/278

Update a local copy of the PR:
$ git checkout pull/278
$ git pull https://git.openjdk.org/jextract.git pull/278/head

Using Skara CLI tools

Checkout this PR locally:
$ git pr checkout 278

View PR using the GUI difftool:
$ git pr show -t 278

Using diff file

Download this PR as a diff file:
https://git.openjdk.org/jextract/pull/278.diff

Using Webrev

Link to Webrev Comment

nizarbenalla avatar Feb 03 '25 18:02 nizarbenalla

:wave: Welcome back nbenalla! A progress list of the required criteria for merging this PR into master will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

bridgekeeper[bot] avatar Feb 03 '25 18:02 bridgekeeper[bot]

@nizarbenalla This change now passes all automated pre-integration checks.

After integration, the commit message for the final commit will be:

7903933: Move sharable items from different generations to a common file

Reviewed-by: mcimadamore, jvernee

You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed.

At the time when this comment was updated there had been 1 new commit pushed to the master branch:

  • ec585cccaa131f429f77df3620325fe465c69ee0: 7903877: jextract exception handling in downcall wrappers

Please see this link for an up-to-date comparison between the source branch of this pull request and the master branch. As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details.

As you do not have Committer status in this project an existing Committer must agree to sponsor your change. Possible candidates are the reviewers of this PR (@mcimadamore, @JornVernee) but any other Committer may sponsor as well.

➡️ To flag this PR as ready for integration with the above commit message, type /integrate in a new comment. (Afterwards, your sponsor types /sponsor in a new comment to perform the integration).

openjdk[bot] avatar Feb 03 '25 18:02 openjdk[bot]

@nizarbenalla This pull request has been inactive for more than 4 weeks and will be automatically closed if another 4 weeks passes without any activity. To avoid this, simply add a new comment to the pull request. Feel free to ask for assistance if you need help with progressing this pull request towards integration!

bridgekeeper[bot] avatar Mar 03 '25 23:03 bridgekeeper[bot]

I wasn't happy with the previous approach/implementation and started over from scratch. I also did some small cleanup (the newly added option for frameworks should use -- rather than -).

Shared items from multiple generations will only be moved to a class if the user specifies the --sharable-items option with a class name of their choosing; the main header file then extends that class. This will be helpful for projects that need to run jextract on multiple headers, especially if those headers are large.

nizarbenalla avatar Mar 27 '25 05:03 nizarbenalla

Adding an option to move shared items in a separate class feels like taking a shortcut. Let's look at what's inside this bag of shared items (it used to be much bigger, so good news there):

  1. library arena
  2. trace downcall support
  3. alignment method
  4. findOrThrow
  5. upcallHandle
  6. primitive layout constants

I believe (1) should probably not live in a class separate from the headers. The way I see it, it belongs wherever the library lookup belongs. Even if you run multiple extractions, it is not clear to me whether there should be a single lifetime for all the different libraries (and, eventually, we want aspects such as this to be customizable via subclassing).

(2) seems a very useful debug option, and I think we should think about making it more official -- for instance with a linker option.

(3) is a general purpose method that would make sense to feature in the main MemoryLayout API.

(4) seems to be superseded by a similar method that was added in Java 23: https://docs.oracle.com/en/java/javase/23/docs//api/java.base/java/lang/foreign/SymbolLookup.html#findOrThrow(java.lang.String)

(5) seems like it could be inlined in the functional interface classes (it's just a helper to allow clients to create an upcall stub w/o the need to do a MethodHandle lookup -- which needs a try/catch).
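A minimal sketch of what inlining (5) could look like, not the code jextract actually generates. `IntBinaryOp`, `DESC`, `HANDLE`, `allocate` and `roundTrip` are hypothetical names chosen for illustration. The `MethodHandle` lookup and its try/catch sit inside the functional interface, so a client only passes a lambda and an arena. It assumes Java 22 or later.

```java
import java.lang.foreign.Arena;
import java.lang.foreign.FunctionDescriptor;
import java.lang.foreign.Linker;
import java.lang.foreign.MemorySegment;
import java.lang.foreign.ValueLayout;
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;

class UpcallSketch {
    // Hypothetical callback type standing in for a generated functional interface.
    interface IntBinaryOp {
        int apply(int a, int b);

        // C signature assumed for illustration: int (*)(int, int)
        FunctionDescriptor DESC = FunctionDescriptor.of(
                ValueLayout.JAVA_INT, ValueLayout.JAVA_INT, ValueLayout.JAVA_INT);

        // The lookup is done once, here, instead of in a shared helper class.
        MethodHandle HANDLE = lookupApply();

        private static MethodHandle lookupApply() {
            try {
                return MethodHandles.lookup()
                        .findVirtual(IntBinaryOp.class, "apply", DESC.toMethodType());
            } catch (ReflectiveOperationException ex) {
                throw new AssertionError(ex);
            }
        }

        // Clients create an upcall stub without touching MethodHandles themselves.
        static MemorySegment allocate(IntBinaryOp op, Arena arena) {
            return Linker.nativeLinker().upcallStub(HANDLE.bindTo(op), DESC, arena);
        }
    }

    // Calls the stub back through a downcall handle to show it is a valid function pointer.
    static int roundTrip(int a, int b) {
        try (Arena arena = Arena.ofConfined()) {
            MemorySegment stub = IntBinaryOp.allocate((x, y) -> x + y, arena);
            MethodHandle downcall =
                    Linker.nativeLinker().downcallHandle(stub, IntBinaryOp.DESC);
            return (int) downcall.invokeExact(a, b);
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }

    public static void main(String[] args) {
        System.out.println("2 + 3 via upcall stub = " + roundTrip(2, 3));
    }
}
```

With this shape there is no shared `upcallHandle` helper left to place anywhere: each functional interface carries its own lookup.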

Then there's (6). The first reaction I got was: well, whether primitive layouts should be emitted or not seems like another filtering decision (e.g. let's add some options to filter these out). Except this would not work -- it's not just about filtering -- it's also about telling every other file where to find the layouts for such primitive types. If they are defined somewhere else, then jextract needs to know where to find them (e.g. if they are referenced by some other layout jextract is building). Which seems a similar problem as the one this PR is trying to solve anyway.

Stepping back -- I think if we look back, jextract used to generate a RuntimeHelper class, and then everything else. While we moved away from that generation scheme (now we just emit header classes), the shared functionalities are still there. Perhaps a move in a good direction would be to:

  • put all the shared functionalities in the root of the header hierarchy -- and don't put anything else in there
  • this means that there will always be at least two header classes generated: foo_h and foo_h$0, where foo_h extends foo_h$0 and all the shared symbols are in foo_h$0.
  • add an option to override the name of that root header class

This is similar, in spirit, to what you have here, but with the advantage that there's only one generation scheme, not two -- e.g. the superclass with the shared symbol is always there -- only sometimes it can have a different name (because the user said so). Whether we want that default superclass name to be foo_h$0, or maybe something more explicit like foo_h_shared, I'm open to suggestions.

But I do think that we should try and make that shared class as small as possible -- most of the functionality there seems like it could belong in the main FFM API, and it would provide value even for clients not running on top of jextract.

mcimadamore avatar Mar 27 '25 10:03 mcimadamore

Then there's (6). The first reaction I got was: well, whether primitive layouts should be emitted or not seems like another filtering decision (e.g. let's add some options to filter these out). Except this would not work -- it's not just about filtering -- it's also about telling every other file where to find the layouts for such primitive types. If they are defined somewhere else, then jextract needs to know where to find them (e.g. if they are referenced by some other layout jextract is building). Which seems a similar problem as the one this PR is trying to solve anyway.

Note/history: while tempting, we can't really put (6) inside the main FFM API. We have thought about this for a long time -- the issue is that the type of a primitive layout is not guaranteed to be stable. E.g. C_LONG might be either a ValueLayout.OfLong or a ValueLayout.OfInt, depending on the platform. Other layouts, such as C_LONG_DOUBLE, might only be available on some platforms and not others. This is why, long ago, we decided that the main FFM API should not concern itself with providing layout constants for C types -- as the set of such constants is not stable. Instead, such primitive C layouts can be "discovered" using the Linker::canonicalLayouts API.
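To make the "discovery" concrete, here is a small sketch that looks up C primitive layouts by name through `Linker.canonicalLayouts()`. The class name `CanonicalLayoutsSketch` and the helper `cLayout` are hypothetical; the keys `"int"` and `"long"` are among the canonical names documented for the native linker. It assumes Java 22 or later.

```java
import java.lang.foreign.Linker;
import java.lang.foreign.MemoryLayout;
import java.util.Map;

class CanonicalLayoutsSketch {
    // Looks up the platform layout for a C type name such as "int" or "long".
    static MemoryLayout cLayout(String typeName) {
        Map<String, MemoryLayout> layouts = Linker.nativeLinker().canonicalLayouts();
        MemoryLayout layout = layouts.get(typeName);
        if (layout == null) {
            throw new IllegalArgumentException("no canonical layout for: " + typeName);
        }
        return layout;
    }

    public static void main(String[] args) {
        // The concrete layout type for "long" differs between platforms,
        // which is why it cannot be a fixed constant in the FFM API.
        MemoryLayout cLong = cLayout("long");
        System.out.println("long -> " + cLong.getClass().getSimpleName()
                + ", " + cLong.byteSize() + " bytes");
        System.out.println("int  -> " + cLayout("int").byteSize() + " bytes");
    }
}
```

The static type returned is only `MemoryLayout`, so a generator like jextract still has to decide, per platform, which `ValueLayout` subtype to emit for each `C_*` constant.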

mcimadamore avatar Mar 27 '25 10:03 mcimadamore

Then there's (6). The first reaction I got was: well, whether primitive layouts should be emitted or not seems like another filtering decision (e.g. let's add some options to filter these out). Except this would not work -- it's not just about filtering -- it's also about telling every other file where to find the layouts for such primitive types. If they are defined somewhere else, then jextract needs to know where to find them (e.g. if they are referenced by some other layout jextract is building). Which seems a similar problem as the one this PR is trying to solve anyway.

Also on this topic: for now we're mostly concerned about different extractions not repeating the code for primitive layouts and helper functions. This feels more like a "tip of the iceberg" kind of situation. For instance, you might have two libraries A and B, which both include the header of some third library C. Maybe you want to extract C separately, and then extract A and B so that they somehow magically point at the extracted bindings for C. Now sharing would be not just about primitive types, but about functions, structs and much more.

At the same time, going down this path can be very complex: A and B might pull in slightly different versions of C, or use some #define macro directives which would alter the shape of the generated bindings in C. In which case reusing the same bindings for C would be more difficult.

So, hidden somewhere in here there's a theme of: how do we move jextract from a per-extraction set of bindings to a multi-extraction friendly model. And going down this path will likely, I think, result in opening a big and complicated can of worms. I'm not saying we'll never get there -- but there's a limit to what we can express with simple command line options.

mcimadamore avatar Mar 27 '25 10:03 mcimadamore

this means that there will always be at least two header classes generated: foo_h and foo_h$0, where foo_h extends foo_h$0 and all the shared symbols are in foo_h$0.

Should we just give the base header a common name then, so that if you generate multiple times, you get sharing automatically? Maybe it could be named something like Builtins.

JornVernee avatar Mar 27 '25 14:03 JornVernee

this means that there will always be at least two header classes generated: foo_h and foo_h$0, where foo_h extends foo_h$0 and all the shared symbols are in foo_h$0.

Should we just give the base header a common name then, so that if you generate multiple times, you get sharing automatically? Maybe it could be named something like Builtins.

Yes, see

This is similar, in spirit, to what you have here, but with the advantage that there's only one generation scheme, not two -- e.g. the superclass with the shared symbol is always there -- only sometimes it can have a different name (because the user said so). Whether we want that default superclass name to be foo_h$0, or maybe something more explicit like foo_h_shared, I'm open to suggestions.

mcimadamore avatar Mar 27 '25 15:03 mcimadamore

(4) seems to be superseded by a similar method that was added in Java 23: https://docs.oracle.com/en/java/javase/23/docs//api/java.base/java/lang/foreign/SymbolLookup.html#findOrThrow(java.lang.String)

I meant to remove this when targeting jdk 23, it seems I left it by mistake. I will remove it to reduce the number of shared items we're dealing with.
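For context, a sketch of the kind of hand-written helper that becomes redundant. `FindOrThrowSketch`, `LOOKUP` and `isResolvable` are hypothetical names, and the helper body is an assumption about its general shape rather than a copy of the generated code. It is written against Java 22 so it runs without the newer method; the comment shows the Java 23 replacement.

```java
import java.lang.foreign.Linker;
import java.lang.foreign.MemorySegment;
import java.lang.foreign.SymbolLookup;

class FindOrThrowSketch {
    // Default lookup: symbols from commonly used system libraries (e.g. libc).
    static final SymbolLookup LOOKUP = Linker.nativeLinker().defaultLookup();

    // Hand-written helper: resolve a symbol or fail loudly.
    // On Java 23+ this can be replaced by LOOKUP.findOrThrow(name),
    // which throws NoSuchElementException when the symbol is missing.
    static MemorySegment findOrThrow(String name) {
        return LOOKUP.find(name)
                .orElseThrow(() -> new UnsatisfiedLinkError("unresolved symbol: " + name));
    }

    // Convenience wrapper used only by this example.
    static boolean isResolvable(String name) {
        try {
            findOrThrow(name);
            return true;
        } catch (UnsatisfiedLinkError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("strlen resolvable: " + isResolvable("strlen"));
        System.out.println("bogus resolvable:  " + isResolvable("no_such_symbol_7903933"));
    }
}
```

Note that the exception type differs between the two variants, so callers that catch a specific exception would need adjusting when switching to the built-in method.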

nizarbenalla avatar Mar 27 '25 16:03 nizarbenalla

I've moved the shared symbols to a different class foo_h$shared that all headers extend from.

The new command line option was renamed to --shared-symbols to allow users to specify the name of the class if they want to.
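Roughly, the resulting shape could be sketched as follows. This is hand-written for illustration, not actual jextract output: `SharedSymbols` is a hypothetical user-chosen class name, `foo_h` and `bar_h` stand for the header classes of two separate extractions, and `sizeofPoint` is an invented member. Only the idea that every header class extends one shared class comes from the discussion above.

```java
import java.lang.foreign.ValueLayout;

class SharedLayoutDemo {
    public static void main(String[] args) {
        // Both header classes inherit the very same constant from the shared class.
        System.out.println("same C_INT: " + (foo_h.C_INT == bar_h.C_INT));
        System.out.println("sizeof(point) = " + foo_h.sizeofPoint());
    }
}

// Shared superclass: holds only the items common to every extraction.
class SharedSymbols {
    SharedSymbols() {}

    public static final ValueLayout.OfInt C_INT = ValueLayout.JAVA_INT;
    public static final ValueLayout.OfDouble C_DOUBLE = ValueLayout.JAVA_DOUBLE;
}

// Header class from a first extraction (hypothetical content).
class foo_h extends SharedSymbols {
    // Assumed struct for illustration: struct point { int x; int y; }
    static long sizeofPoint() {
        return 2 * C_INT.byteSize();
    }
}

// Header class from a second extraction, reusing the same shared class.
class bar_h extends SharedSymbols {
}
```

Because static members are inherited, existing client code that writes `foo_h.C_INT` keeps compiling regardless of what the shared class is called.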

nizarbenalla avatar Apr 16 '25 19:04 nizarbenalla

The option for changing header class name is called --header-class-name <name>. So I'd suggest something like --symbols-class-name.

mcimadamore avatar Apr 28 '25 15:04 mcimadamore

I pushed a small test-only update. I was seeing some failures in the CI (that I didn't see locally), and this fixes them.

nizarbenalla avatar May 05 '25 12:05 nizarbenalla

Thanks for all the rounds of reviews.

/integrate

nizarbenalla avatar May 06 '25 11:05 nizarbenalla

@nizarbenalla Your change (at version e3b9ae2a21517e008a06e3e55d42a5957366d4ad) is now ready to be sponsored by a Committer.

openjdk[bot] avatar May 06 '25 11:05 openjdk[bot]

Thanks for the updates. Looks good now!

JornVernee avatar May 08 '25 14:05 JornVernee

Thanks! Glad this work can be integrated into jextract.

/integrate

nizarbenalla avatar May 08 '25 14:05 nizarbenalla

@nizarbenalla Your change (at version 5733943f65702d599da7cbbc1699b38740a531c9) is now ready to be sponsored by a Committer.

openjdk[bot] avatar May 08 '25 14:05 openjdk[bot]

/sponsor

JornVernee avatar May 08 '25 15:05 JornVernee

Going to push as commit ab6b30fd189e33a52d366846202f2e9b9b280142. Since your change was applied there has been 1 commit pushed to the master branch:

  • ec585cccaa131f429f77df3620325fe465c69ee0: 7903877: jextract exception handling in downcall wrappers

Your commit was automatically rebased without conflicts.

openjdk[bot] avatar May 08 '25 15:05 openjdk[bot]

@JornVernee @nizarbenalla Pushed as commit ab6b30fd189e33a52d366846202f2e9b9b280142.

:bulb: You may see a message that your pull request was closed with unmerged commits. This can be safely ignored.

openjdk[bot] avatar May 08 '25 15:05 openjdk[bot]