WIP: [Java] Require Java 21 to build Arrow
** This is work in progress **
Rationale for this change
This change demonstrates how to modernize the Java toolchain used to build Arrow, and opens the door to supporting some recent Java features while maintaining compatibility with Java 8.
Java tools used by our build system (like spotless, google-java-format, error-prone, etc.) have dropped support for Java 8 and may drop Java 11 as well, which forces the build system to stay on unsupported versions of those tools.
The lack of recent Java versions in our toolchains also means that adopting recent Java features (through mechanisms like multi-release jars, for example) is currently not possible.
What changes are included in this PR?
This change makes Java 21 the minimum version required to build Arrow while Java 8 remains the minimum Java version required to use Arrow libraries and tools.
For the purpose of demonstrating the end result, this change actually captures multiple changes:
- Remove direct access to `sun.misc.Unsafe` and use `MethodHandles` to invoke methods close to native speed. This is a prerequisite for the next bullet point
- Compile arrow code base with `--release` flag, allowing true Java 8 compilation independently of the JDK version used by the build
- Add support for cross version testing: Introduce a property `arrow.test.jdk-version` which accepts a JDK version to run unit and integration tests with (the JDK needs to be declared in Maven toolchains first), which is different from the JDK version used to build Arrow
- Finally make Java 21 the minimum version required to build Arrow and update CI builds accordingly (there are several image changes needed to support Java 21. This change also introduces a new multi-JDK docker image)
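
To illustrate the first bullet, here is a minimal sketch of reaching `sun.misc.Unsafe` without a compile-time reference to it, so the code can still be compiled with `--release 8`. This is not the code from this PR: the class name `UnsafeHandles` and the choice of `pageSize()` as the target method are assumptions made for illustration only.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.lang.reflect.Field;

// Hypothetical helper, not part of Arrow: resolves Unsafe by name at runtime.
final class UnsafeHandles {
  // Resolved once; a constant MethodHandle can be inlined by the JIT.
  private static final MethodHandle PAGE_SIZE = lookupPageSize();

  private static MethodHandle lookupPageSize() {
    try {
      // No compile-time dependency on sun.misc.Unsafe: the class is loaded by name.
      Class<?> unsafeClass = Class.forName("sun.misc.Unsafe");
      Field field = unsafeClass.getDeclaredField("theUnsafe");
      field.setAccessible(true);
      Object unsafe = field.get(null);
      // Look up Unsafe.pageSize() and bind it to the singleton instance.
      return MethodHandles.lookup()
          .findVirtual(unsafeClass, "pageSize", MethodType.methodType(int.class))
          .bindTo(unsafe);
    } catch (ReflectiveOperationException e) {
      throw new ExceptionInInitializerError(e);
    }
  }

  static int pageSize() {
    try {
      // invokeExact requires the call site to match the handle type ()int exactly.
      return (int) PAGE_SIZE.invokeExact();
    } catch (Throwable t) {
      throw new IllegalStateException("Unsafe.pageSize() failed", t);
    }
  }
}
```

Callers would then use `UnsafeHandles.pageSize()` instead of touching an `Unsafe` field directly. The same pattern would apply to other `Unsafe` methods, with one handle per method.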
Are these changes tested?
These changes have been tested via GitHub Actions as much as possible. I haven't tested with crossbow yet.
Are there any user-facing changes?
No direct user-facing changes but developers will be required to update their Java toolchain.
@github-actions crossbow submit -g java
Revision: e31e655858016278a31eb1e0eca71f75270494f1
Submitted crossbow builds: ursacomputing/crossbow @ actions-7d97f6c645
Looks like crossbow tests are failing (which is expected since I didn't try to run them and modified all of the actions).
I will try and see if I can fix them, but I am also trying to gauge the interest in this work and, if there is interest, whether I should decompose it into sub-issues.
So we'd use Java 21, but we'd still be limited to Java 8 features and dependencies, other than maybe things like being able to upgrade ErrorProne now that we don't build with Java 8 - is that right?
Not necessarily. By leveraging the multi-release JAR (MRJAR) feature, you could create a src/main/java21 directory and instruct the maven-compiler-plugin to use Java 21 to build that code and get it packaged inside the jar under META-INF/versions/M/.... Using this mechanism you can have two implementations of the same class, one for Java 8 up to Java 20 and one for Java 21 and higher (the right one being picked up by the JVM automatically).
We could start with module-info.java files, which are Java 9+ only, for example...
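
To make the MRJAR mechanism concrete, here is a small sketch that builds a multi-release jar in a temporary file and shows that the same entry name resolves to different content depending on the version the jar is opened with. Everything in it is hypothetical (the `MrJarDemo` class, the `impl.txt` entry, and the use of `versions/9` rather than `versions/21` so that it runs on any JDK 9+); in a real build the Maven plugins would produce the jar and the entries would be class files.

```java
import java.io.File;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.util.jar.Attributes;
import java.util.jar.JarEntry;
import java.util.jar.JarFile;
import java.util.jar.JarOutputStream;
import java.util.jar.Manifest;
import java.util.zip.ZipFile;

// Hypothetical demo class, requires JDK 9+ to run.
final class MrJarDemo {
  // Builds a jar with a base entry and a versioned override of the same entry.
  static File buildJar() throws IOException {
    Manifest manifest = new Manifest();
    manifest.getMainAttributes().put(Attributes.Name.MANIFEST_VERSION, "1.0");
    // This attribute is what turns a plain jar into a multi-release jar.
    manifest.getMainAttributes().put(Attributes.Name.MULTI_RELEASE, "true");
    File jar = File.createTempFile("mrjar-demo", ".jar");
    jar.deleteOnExit();
    try (OutputStream out = Files.newOutputStream(jar.toPath());
        JarOutputStream jos = new JarOutputStream(out, manifest)) {
      writeEntry(jos, "impl.txt", "base");
      writeEntry(jos, "META-INF/versions/9/impl.txt", "java9+");
    }
    return jar;
  }

  private static void writeEntry(JarOutputStream jos, String name, String content)
      throws IOException {
    jos.putNextEntry(new JarEntry(name));
    jos.write(content.getBytes(StandardCharsets.UTF_8));
    jos.closeEntry();
  }

  // Reads "impl.txt" as a runtime of the given version would see it.
  static String read(File jar, Runtime.Version version) throws IOException {
    try (JarFile jf = new JarFile(jar, true, ZipFile.OPEN_READ, version)) {
      JarEntry entry = jf.getJarEntry("impl.txt");
      try (InputStream in = jf.getInputStream(entry)) {
        return new String(in.readAllBytes(), StandardCharsets.UTF_8);
      }
    }
  }
}
```

Opening the jar with `JarFile.baseVersion()` returns the base entry, while opening it with `Runtime.version()` on JDK 9 or later returns the versioned one. This is the same selection the class loader performs for class files under `META-INF/versions`.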
Ah, I see.
@danepitkin the plan was to drop Java 8 for the 18.0.0 release, right? Or was it 19.0.0?
The current plan is to drop in v18. It would be nice to get some help on this though.
I think this work is valuable and I am at least OK with requiring Java 21 for building. But if we are going to drop JDK8 support entirely soon-ish then it sounds like there's less to be gained from this work.
There are still some long term gains:
- Some plugins are already considering dropping Java 11 support as the official OpenJDK end of life date is October 2024 (while Java 8 is supported until November 2026...)
- I believe a couple of recent Java features could be of great interest to the Arrow project:
- Foreign Function and Memory API: FFI could provide a better solution than JNI for some of the adapters and Gandiva. The new memory API could provide a better memory allocator than the current Netty one
- Vector API: Could possibly help with vectorized operations
But to be able to experiment with or introduce support for those new APIs, a modern toolchain is still a prerequisite
Yes, I think this is still helpful - but dropping Java 8 may simplify this?
Yes. It would not be necessary to refactor sun.misc.Unsafe access, although it would still be worth removing it as a public field from MemoryUtil IMHO
I just rebased on top of current master and removed the already merged code and the changes to MemoryUtil. Will update the description as well
@github-actions crossbow submit test-debian-12-docs
Unable to match any tasks for `test-debian-12-docs`
The Archery job run can be found at: https://github.com/apache/arrow/actions/runs/10372038877
It seems error-prone now requires Java 17 to build (it can still target Java 11), so we may want to consider this again if other tools are going to do the same thing