WIP: [Java] Require Java 21 to build Arrow

Open laurentgo opened this issue 1 year ago • 12 comments

**This is a work in progress.**

Rationale for this change

This change demonstrates how to modernize the Java toolchain used to build Arrow, opening opportunities to adopt some recent Java features while maintaining compatibility with Java 8.

Java tools used by our build system (like spotless, google-java-format, error-prone, etc.) have dropped support for Java 8 and may do the same for Java 11, forcing the build system to stay on unsupported versions of those tools.

The lack of recent Java versions in our toolchains also means that adopting recent Java features (through mechanisms like multi-release jars, for example) is currently not possible.

What changes are included in this PR?

This change makes Java 21 the minimum version required to build Arrow while Java 8 remains the minimum Java version required to use Arrow libraries and tools.

For the purpose of demonstrating the end result, this change actually captures multiple changes:

  • Remove direct access to sun.misc.Unsafe and use MethodHandles to invoke its methods at close to native speed. This is a prerequisite for the next bullet point.
  • Compile the Arrow code base with the --release flag, allowing true Java 8 compilation independently of the JDK version used by the build.
  • Add support for cross-version testing: introduce a property, arrow.test.jdk-version, which accepts the JDK version to run unit and integration tests with. This version can differ from the JDK version used to build Arrow, and the JDK needs to be declared in Maven toolchains first.
  • Finally, make Java 21 the minimum version required to build Arrow and update CI builds accordingly. Several image changes are needed to support Java 21, and this change also introduces a new multi-JDK docker image.
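As a rough illustration of the first bullet (not the code in this PR), the sketch below reaches sun.misc.Unsafe without referencing the class at compile time: the class is loaded by name and one of its methods is bound to a MethodHandle. The class name UnsafeHandles and the choice of pageSize() as the target method are assumptions made for the example only.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.lang.reflect.Field;

// Hypothetical helper, for illustration only; not part of Arrow.
public class UnsafeHandles {
  // Resolved once. Calls through a static final MethodHandle can typically be
  // inlined by the JIT, which is what keeps the indirection cheap.
  private static final MethodHandle PAGE_SIZE = lookupPageSize();

  private static MethodHandle lookupPageSize() {
    try {
      // Load the class by name so no compile-time reference to sun.misc exists.
      Class<?> unsafeClass = Class.forName("sun.misc.Unsafe");
      // The singleton instance lives in a private static field.
      Field field = unsafeClass.getDeclaredField("theUnsafe");
      field.setAccessible(true);
      Object unsafe = field.get(null);
      // Bind the instance so callers see a plain ()int handle.
      return MethodHandles.lookup()
          .findVirtual(unsafeClass, "pageSize", MethodType.methodType(int.class))
          .bindTo(unsafe);
    } catch (ReflectiveOperationException e) {
      throw new ExceptionInInitializerError(e);
    }
  }

  public static int pageSize() {
    try {
      return (int) PAGE_SIZE.invokeExact();
    } catch (Throwable t) {
      throw new IllegalStateException(t);
    }
  }

  public static void main(String[] args) {
    System.out.println("page size: " + pageSize());
  }
}
```

Because the source no longer names sun.misc.Unsafe directly, it should also compile under --release 8, where that internal API is not visible to the compiler. Recent JDKs may print deprecation warnings when Unsafe methods are called.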

Are these changes tested?

These changes have been tested via GitHub Actions as much as possible. I haven't tested with crossbow yet.

Are there any user-facing changes?

No direct user-facing changes but developers will be required to update their Java toolchain.

laurentgo avatar Jun 25 '24 19:06 laurentgo

Thanks for opening a pull request!

If this is not a minor PR, could you open an issue for this pull request on GitHub? https://github.com/apache/arrow/issues/new/choose

Opening GitHub issues ahead of time contributes to the Openness of the Apache Arrow project.

Then could you also rename the pull request title in the following format?

GH-${GITHUB_ISSUE_ID}: [${COMPONENT}] ${SUMMARY}

or

MINOR: [${COMPONENT}] ${SUMMARY}

In the case of PARQUET issues on JIRA the title also supports:

PARQUET-${JIRA_ISSUE_ID}: [${COMPONENT}] ${SUMMARY}

github-actions[bot] avatar Jun 25 '24 19:06 github-actions[bot]

@github-actions crossbow submit -g java

danepitkin avatar Jun 26 '24 20:06 danepitkin

Revision: e31e655858016278a31eb1e0eca71f75270494f1

Submitted crossbow builds: ursacomputing/crossbow @ actions-7d97f6c645

| Task | Status |
|---|---|
| java-jars | GitHub Actions |
| test-conda-python-3.10-spark-v3.5.0 | GitHub Actions |
| test-conda-python-3.11-spark-master | GitHub Actions |
| test-conda-python-3.8-spark-v3.5.0 | GitHub Actions |
| verify-rc-source-java-linux-almalinux-8-amd64 | GitHub Actions |
| verify-rc-source-java-linux-conda-latest-amd64 | GitHub Actions |
| verify-rc-source-java-linux-ubuntu-20.04-amd64 | GitHub Actions |
| verify-rc-source-java-linux-ubuntu-22.04-amd64 | GitHub Actions |
| verify-rc-source-java-macos-amd64 | GitHub Actions |

github-actions[bot] avatar Jun 26 '24 20:06 github-actions[bot]

Looks like crossbow tests are failing (which is expected since I didn't try to run them and modified all of the actions).

I will see if I can modify them, but I am also trying to gauge interest in this work and, if there is interest, whether I should decompose it into sub-issues.

laurentgo avatar Jun 26 '24 21:06 laurentgo

So we'd use Java 21, but we'd still be limited to Java 8 features and dependencies, other than maybe things like being able to upgrade ErrorProne now that we don't build with Java 8 - is that right?

lidavidm avatar Jun 26 '24 23:06 lidavidm

> So we'd use Java 21, but we'd still be limited to Java 8 features and dependencies, other than maybe things like being able to upgrade ErrorProne now that we don't build with Java 8 - is that right?

Not necessarily. By leveraging the multi-release JAR (MRJAR) feature, you could create a src/main/java21 directory and instruct the maven-compiler-plugin to compile that code with Java 21 and package it inside the jar under META-INF/versions/M/.... Using this mechanism you can have two implementations of the same class, one for Java 8 up to Java 20 and one for Java 21 and higher (the JVM picks the right one automatically).

We could start with module-info.java files, which are Java 9+ only, for example...
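To make the version lookup concrete, here is a minimal sketch (an illustration, not code from this PR) that builds a tiny multi-release jar in a temp file and reads the same entry name twice: once with the base view and once with the running JVM's version. The class name MultiReleaseDemo, the entry name impl.txt, and the use of versions/9 (so the demo runs on any JDK 9 or later) are assumptions for the example; the PR itself would target Java 21.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.jar.Attributes;
import java.util.jar.JarEntry;
import java.util.jar.JarFile;
import java.util.jar.JarOutputStream;
import java.util.jar.Manifest;
import java.util.zip.ZipFile;

// Hypothetical demo class; requires JDK 9+ to compile and run.
public class MultiReleaseDemo {

  // Builds a jar holding a base entry plus a versioned override of it.
  static Path buildJar() throws IOException {
    Manifest manifest = new Manifest();
    manifest.getMainAttributes().put(Attributes.Name.MANIFEST_VERSION, "1.0");
    // Without this attribute, META-INF/versions/* is ignored.
    manifest.getMainAttributes().put(Attributes.Name.MULTI_RELEASE, "true");
    Path jar = Files.createTempFile("mrjar-demo", ".jar");
    try (OutputStream out = Files.newOutputStream(jar);
        JarOutputStream jos = new JarOutputStream(out, manifest)) {
      put(jos, "impl.txt", "base");
      put(jos, "META-INF/versions/9/impl.txt", "java9+");
    }
    return jar;
  }

  private static void put(JarOutputStream jos, String name, String body) throws IOException {
    jos.putNextEntry(new JarEntry(name));
    jos.write(body.getBytes(StandardCharsets.UTF_8));
    jos.closeEntry();
  }

  // Reads impl.txt as seen by the running JVM (versioned view).
  static String readVersioned(Path jar) throws IOException {
    try (JarFile jf = new JarFile(jar.toFile(), true, ZipFile.OPEN_READ, Runtime.version())) {
      return read(jf);
    }
  }

  // Reads impl.txt with the base view, as a pre-Java 9 runtime would see it.
  static String readBase(Path jar) throws IOException {
    try (JarFile jf = new JarFile(jar.toFile())) {
      return read(jf);
    }
  }

  private static String read(JarFile jf) throws IOException {
    try (InputStream in = jf.getInputStream(jf.getJarEntry("impl.txt"))) {
      return new String(in.readAllBytes(), StandardCharsets.UTF_8);
    }
  }

  public static void main(String[] args) throws IOException {
    Path jar = buildJar();
    System.out.println("base view:      " + readBase(jar));
    System.out.println("versioned view: " + readVersioned(jar));
    Files.delete(jar);
  }
}
```

Class files are resolved the same way as this text entry: the class loader picks the copy under the highest META-INF/versions/N directory that does not exceed the running Java version, falling back to the base entry.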

laurentgo avatar Jun 26 '24 23:06 laurentgo

Ah, I see.

@danepitkin the plan was to drop Java 8 for the 18.0.0 release, right? Or was it 19.0.0?

lidavidm avatar Jun 27 '24 00:06 lidavidm

The current plan is to drop in v18. It would be nice to get some help on this though.

danepitkin avatar Jun 27 '24 05:06 danepitkin

> I will see if I can modify them, but I am also trying to gauge interest in this work and, if there is interest, whether I should decompose it into sub-issues.

I think this work is valuable and I am at least OK with requiring Java 21 for building. But if we are going to drop JDK8 support entirely soon-ish then it sounds like there's less to be gained from this work.

lidavidm avatar Jun 27 '24 07:06 lidavidm

There are still some long term gains:

  • Some plugins are already considering dropping Java 11 support as the official OpenJDK end of life date is October 2024 (while Java 8 is supported until November 2026...)
  • I believe a couple of recent Java features could be of great interest to the Arrow project:
      • Foreign Function and Memory API: FFI could provide a better solution than JNI for some of the adapters and Gandiva. The new memory API could provide a better memory allocator than the current Netty one
      • Vector API: Could possibly help with vectorized operations

But to be able to experiment with or introduce support for those new APIs, a modern toolchain is still a prerequisite.

laurentgo avatar Jun 27 '24 15:06 laurentgo

Yes, I think this is still helpful - but dropping Java 8 may simplify this?

lidavidm avatar Jul 01 '24 00:07 lidavidm

Yes, it would not be necessary to refactor sun.misc.Unsafe access, although it would still be worth removing it as a public field from MemoryUtil IMHO.

laurentgo avatar Jul 02 '24 15:07 laurentgo

I just rebased on top of current master and removed the already merged code and the changes to MemoryUtil. I will update the description as well.

laurentgo avatar Jul 29 '24 22:07 laurentgo

@github-actions crossbow submit test-debian-12-docs

danepitkin avatar Aug 13 '24 14:08 danepitkin

Unable to match any tasks for `test-debian-12-docs`
The Archery job run can be found at: https://github.com/apache/arrow/actions/runs/10372038877

github-actions[bot] avatar Aug 13 '24 14:08 github-actions[bot]

It seems error-prone is requiring Java 17 for building (can still target Java 11) so we may want to consider this again if other tools are going to do the same thing

lidavidm avatar Oct 08 '24 01:10 lidavidm