Java 25, the Foreign Function & Memory API and removal of Unsafe

Open at055612 opened this issue 2 months ago • 15 comments

@stroomdev66 and I (@at055612) have recently been added as committers on this project to help out @benalexau, having been long-time users of LmdbJava and occasional contributors. We actively use LmdbJava in a number of different use cases in our OSS project and are very grateful to @benalexau, @krisskross and all the other contributors for a great library.

Now that Java 25 is generally available and includes the finalized Foreign Function and Memory (FFM) API, it seems sensible to incorporate the API into LmdbJava. This has been a long-anticipated change (see #42). We are therefore keen to change LmdbJava to use the FFM API and remove the use of Unsafe. @stroomdev66 has done some initial experimentation with replacing jnr-ffi with the FFM API.

Such changes need to be made in a way that does not cause too much pain for the LmdbJava user community. We have had some initial exploratory conversations with @benalexau about the way forward, and this is the current rough plan.


LmdbJava v1.x

  • Maintain the current Java compile target of Java 8.
  • Address the two outstanding PRs:
    • #250 - This one is mine and needs to be finished to address comparator issues introduced in 0.9.0. It will mean the removal of MDB_UNSIGNEDKEY that was added in 0.9.1 to partially address 0.9.0.
    • #255
  • Uplift any existing dependencies, including LMDB.

Users that are not in a position to move to Java 25 can continue to use v1.x with no change.

LmdbJava v2.x

  • Change the Java compile target to Java 25.
  • Change the internal LmdbJava code to use the FFM API for interaction with LMDB.
  • Change Env to create and hold on to an Arena instance for its life. This will be used for creating MemorySegments internally.
  • Change the internal LmdbJava code to use MemorySegments when interacting with LMDB.
  • Change the existing BufferProxy implementations to map the buffers to MemorySegments.
  • Add a new MemorySegmentProxy class so that client code can fully exploit MemorySegment, e.g. Env<MemorySegment>.
  • Remove any use of Unsafe.

This seems like the best way to use the FFM API without causing a lot of breaking change for users of LmdbJava. It may come at the cost of some performance penalty due to mapping between buffer implementations and MemorySegment. There may also be performance differences between MemorySegment and Unsafe buffer access.
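
The idea of an Env that creates and holds an Arena for its whole life can be sketched roughly as below. This is only an illustration, not code from the branch: `ArenaOwner` is a hypothetical stand-in for Env, and the use of a shared arena (rather than a confined one) is an assumption.

```java
import java.lang.foreign.Arena;
import java.lang.foreign.MemorySegment;
import java.lang.foreign.ValueLayout;

/** Hypothetical stand-in for Env; not the LmdbJava class. */
final class ArenaOwner implements AutoCloseable {
  // One arena for the life of the owner. Shared (an assumption) so that
  // segments can be used from more than one thread.
  private final Arena arena = Arena.ofShared();

  /** Allocates an 8-byte-aligned native segment tied to this owner's arena. */
  MemorySegment allocate(long byteSize) {
    return arena.allocate(byteSize, 8);
  }

  @Override
  public void close() {
    arena.close(); // releases every segment allocated from the arena
  }

  /** Writes a long into an arena-backed segment and reads it back. */
  static long roundTrip(long value) {
    try (ArenaOwner owner = new ArenaOwner()) {
      MemorySegment seg = owner.allocate(Long.BYTES);
      seg.set(ValueLayout.JAVA_LONG, 0, value);
      return seg.get(ValueLayout.JAVA_LONG, 0);
    }
  }
}
```

Tying all internal segments to one arena means they are all freed together when the owner is closed, and any later access to them fails instead of touching freed memory.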

This work is currently in progress on this branch: https://github.com/lmdbjava/lmdbjava/tree/java-foreign-function-and-memory-api

and there is a draft PR for the work here #258.

Users who can migrate to Java 25 but have a lot of existing code built around Netty/Agrona/DirectByteBuffer can use this version, hopefully with little or no change.
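
For the buffer-based case, the JDK already lets a ByteBuffer and a MemorySegment view the same memory without copying. The sketch below shows both directions using only standard JDK calls. The class and method names are hypothetical, and it does not show how the BufferProxy implementations on the branch actually do the mapping.

```java
import java.lang.foreign.Arena;
import java.lang.foreign.MemorySegment;
import java.lang.foreign.ValueLayout;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

/** Hypothetical illustration of buffer/segment views; not LmdbJava code. */
final class BufferMapping {
  /** Views a direct ByteBuffer as a segment, writes via the segment, reads via the buffer. */
  static int writeViaSegment(int value) {
    ByteBuffer buffer = ByteBuffer.allocateDirect(Integer.BYTES).order(ByteOrder.nativeOrder());
    MemorySegment segment = MemorySegment.ofBuffer(buffer); // a view, no copy
    segment.set(ValueLayout.JAVA_INT_UNALIGNED, 0, value);
    return buffer.getInt(0);
  }

  /** Views a native segment as a ByteBuffer, writes via the buffer, reads via the segment. */
  static int writeViaBuffer(int value) {
    try (Arena arena = Arena.ofConfined()) {
      MemorySegment segment = arena.allocate(Integer.BYTES);
      // asByteBuffer() returns a big-endian view by default, so set native order.
      ByteBuffer view = segment.asByteBuffer().order(ByteOrder.nativeOrder());
      view.putInt(0, value);
      return segment.get(ValueLayout.JAVA_INT_UNALIGNED, 0);
    }
  }
}
```

Each view is a small object allocation, which is one plausible source of the mapping overhead mentioned above.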

LmdbJava v3.x

  • Include all of the v2.x work.
  • Remove the generic <T> type from Env/Dbi/Txn, etc.
  • Change the LmdbJava API to exclusively use MemorySegment rather than T.
  • Remove the netty and agrona dependencies.
  • Remove BufferProxy and all its implementations.

We think that ultimately MemorySegment is the natural replacement of the various buffer implementations. It offers a much richer API for working with ranges of memory. Therefore it seems sensible for MemorySegment to be the only mechanism for passing data in/out of LMDB.

We assume that, for people who use Netty and Agrona in their projects, those libraries will in time add integration with MemorySegment so that LmdbJava users can map back and forth themselves.

Removing the various buffer implementations will make the LmdbJava code simpler and make it easier for the LmdbJava maintainers to maintain the project and support users' issues.

Users who can migrate to Java 25 and have little or no legacy buffer based code can move to this version.


We welcome any comments or suggestions that people have on the Java25/FFM/Unsafe changes and/or the general direction of LmdbJava development.

at055612 • Oct 24 '25 13:10

Mentioning a few recent contributors in case they have input @wardle @cdprete @pretecd @2018ik

at055612 • Oct 24 '25 13:10

Hey @at055612, do you have some numbers on how much performance will be impacted?

cdprete • Oct 24 '25 14:10

The code I just posted, which is a very rough first attempt, looks about 30% slower. However, I am in the process of profiling and improving the code in several ways to address this. Having said that, I think we will be looking at a 5% impact for the core move to the FFM API, presumably incurred due to the extra safety imposed. The rest is currently due to the removal of Unsafe and the loss of the high performance provided by reuse of buffers and repointing of memory addresses.

On a side note, I have noticed when profiling some additional areas in the existing codebase that have some unwanted overhead. I'm in the process of looking into these so perhaps the gap can be closed in other ways for some use cases.

stroomdev66 • Oct 24 '25 14:10

One area of concern to @at055612 and me when exposing MemorySegments is the 2-byte alignment of LMDB. We have discussed this between ourselves at great length and believe that users of memory segments will need to be careful when crafting structs etc.
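
A minimal sketch of the kind of problem this can cause, assuming a value that starts at an address which is only 2-byte aligned. The class and method names are hypothetical and the segment is allocated locally rather than returned by LMDB. The default `ValueLayout.JAVA_LONG` requires 8-byte alignment, so reading it at such an address is rejected, whereas the `_UNALIGNED` layouts accept it.

```java
import java.lang.foreign.Arena;
import java.lang.foreign.MemorySegment;
import java.lang.foreign.ValueLayout;

/** Hypothetical illustration of aligned vs unaligned access; not LmdbJava code. */
final class AlignmentSketch {
  /** Returns true if an aligned long read at a 2-byte offset is rejected. */
  static boolean alignedReadFails() {
    try (Arena arena = Arena.ofConfined()) {
      MemorySegment seg = arena.allocate(16, 8); // base address is 8-byte aligned
      try {
        seg.get(ValueLayout.JAVA_LONG, 2); // base + 2 is only 2-byte aligned
        return false;
      } catch (IllegalArgumentException e) {
        return true; // misaligned access for an 8-byte-aligned layout
      }
    }
  }

  /** Writes and reads a long at a 2-byte offset using the unaligned layout. */
  static long unalignedRoundTrip(long value) {
    try (Arena arena = Arena.ofConfined()) {
      MemorySegment seg = arena.allocate(16, 8);
      seg.set(ValueLayout.JAVA_LONG_UNALIGNED, 2, value);
      return seg.get(ValueLayout.JAVA_LONG_UNALIGNED, 2);
    }
  }
}
```

The same consideration applies to struct layouts: a layout whose members carry their natural alignment may not be usable directly on a 2-byte-aligned segment.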

stroomdev66 • Oct 24 '25 14:10

That's a lot. I hope it won't end up like this, to be honest.

cdprete • Oct 24 '25 14:10

Unsafe hasn't been removed yet in Java 25, so it might be necessary for us to keep it for the time being, until people can migrate to direct use of MemorySegment, in order to maintain performance. We will do our best to reduce the overhead in whatever way we can. I am confident that the 30% gap will be reduced significantly with a bit of effort.

stroomdev66 • Oct 24 '25 14:10

Even 5% is a lot. Over a 1-minute window that means 3 s of delay, which for us translates to 3M-9M messages being delayed.

cdprete • Oct 24 '25 15:10

Hopefully once we have something that can be released in alpha form, people can do their own performance testing and profiling with their own usage patterns to help find areas that need improvement. There are so many variables when it comes to performance testing something like LmdbJava (size of keys/values, puts vs gets, read/write ratio, cursor iteration, etc.) that it will not be possible to say definitively whether it's slower, faster or the same.

at055612 • Oct 24 '25 15:10

I might be worrying people unnecessarily. It might be more like 1% when using MemorySegment directly and 22% when using ByteBuffers without Unsafe. This picture is improving all the time as I make further improvements.

stroomdev66 • Oct 24 '25 15:10

Would there be any benefit in highlighting this to the JVM FFI team? lmdb and lmdbjava seem to me to be an excellent example of making use of battle-tested native code (although I suspect I am biased) but performance regressions are pretty difficult to defend.

wardle • Oct 24 '25 17:10

> Hopefully once we have something that can be released in alpha form, people can do their own performance testing and profiling with their own usage patterns to help find areas that need improvement. There are so many variables when it comes to performance testing something like LmdbJava (size of keys/values, puts vs gets, read/write ratio, cursor iteration, etc.) that it will not be possible to say definitively whether it's slower, faster or the same.

That would be quite an effort given that the two versions will be fully incompatible and people (like myself) may therefore have to rewrite their persistence layer almost completely.

I understand why you want to move forward in the direction you've taken, don't get me wrong here, but the main selling point of LMDB is its speed. If this drops below a certain threshold (which depends on the use case), there will be no point in using it instead of, for example, RocksDB, especially given all the limitations LMDB has compared with RocksDB.

cdprete • Oct 24 '25 18:10

So we use LmdbJava for the same reason that everybody else does: blazing fast performance. We have 0% intention of decreasing performance by even 1%. It would be bad for us and bad for the community.

What we have started is an initial investigation into a move to the Java FFM API for users of Java 25+. There is much to learn and many changes that might be needed to ensure performance is maintained. At some point Unsafe will be removed from the JDK so it makes sense to start thinking about that impact too.

We have published the first experiment to demonstrate to the community that these problems are being considered. This stimulates discussion, and potentially avoids somebody else wasting time duplicating the effort. There is much to do and we will be testing, profiling and improving code over the coming months.

My initial throwaway remarks on performance were meant to discourage anybody from thinking that the code here is good to go. It is not, and I do not want anybody to use it in any way for a production system.

We will be publishing and maintaining a 1.x release for the foreseeable future that incorporates any fixes, library upgrades or other maintenance needed. It might also end up with other improvements if we learn some lessons from the new investigation. 2.x will only be published if we get to a satisfactory point. This may indeed require some discussion with the Java FFM team.

We aim to do the right thing for everybody so hopefully we will get there.

stroomdev66 • Oct 24 '25 23:10

Fair enough.

From my side, I wanted to make sure the main goal of why the library exists doesn't get lost just for the sake of moving to FFM.

cdprete • Oct 25 '25 07:10

Interesting reading on the performance differences between FFM and Unsafe:

https://inside.java/2025/06/12/ffm-vs-unsafe/

at055612 • Oct 27 '25 08:10

See https://github.com/netty/netty/wiki/Java-24-and-sun.misc.Unsafe regarding native access. It may help here ;)

cdprete • Dec 09 '25 18:12