pekko icon indicating copy to clipboard operation
pekko copied to clipboard

review Java usage of ActorSystem

Open pjfanning opened this issue 3 months ago • 41 comments

See https://github.com/apache/pekko/pull/1419#issuecomment-2308390446

terminate() returns a Scala Future

In theory, users can ignore the return value from terminate() and call this: https://pekko.apache.org/japi/pekko/current/org/apache/pekko/actor/ActorSystem.html#getWhenTerminated()

Maybe, we could also add terminateWithCompletionStage that returns CompletionStage<Terminated>?

Interestingly, the actor-typed ActorSystem has a void return type on terminate(). You have https://pekko.apache.org/japi/pekko/current/org/apache/pekko/actor/typed/ActorSystem.html#getWhenTerminated() for Java users.

pjfanning avatar Aug 25 '25 17:08 pjfanning

@mdedetrich @raboof @He-Pin My view is that it's not very useful to make changes. Java users can ignore the return value on the class ActorSystem terminate() call and then use the getWhenTerminated() method to get a CompletionStage to check. This maps closely to what they would do with the typed ActorSystem too (where terminate() returns a void anyway).

pjfanning avatar Aug 25 '25 17:08 pjfanning

Not sure about this, there is an Futures.asJava help method, maybe we can add gracefulTerminate(timeout, timeunit) method as Netty?I think that can cover 80% cases

He-Pin avatar Aug 25 '25 17:08 He-Pin

@mdedetrich @raboof @He-Pin My view is that it's not very useful to make changes. Java users can ignore the return value on the class ActorSystem terminate() call and then use the getWhenTerminated() method to get a CompletionStage to check. This maps closely to what they would do with the typed ActorSystem too (where terminate() returns a void anyway).

Ignoring the Future from .terminate() call shouldn't be encouraged, because if you need to do something after the Actor termination then the termination will only occur successful when that Future completes.

When using the java API I actually have to use .toCompletableFuture() because of this, i.e.

fun shutdown() {
    logger.info("Terminating actor system")
    FutureConverters.asJava(system.terminate()).toCompletableFuture().whenComplete { _, exception ->
        if (exception != null) {
            logger.warn("Error terminating actor system", exception)
        } else
            logger.info("Terminated actor system")
    }.get()
}

Its also not consistent, i.e. if you are using pekko http its standard practice to terminate the actor system after you terminate the http server, where again you have to call the .toCompletableFuture() method when using java api

serverBinding.terminate(Duration.ofSeconds(10)).thenCompose {
  FutureConverters.asJava(system.terminate()).toCompletableFuture()
}

its all really quite ugly, I would personally push to have a proper terminate method for the Java API which returns a CompletableFuture. 2.0.x allows for breaking changes, so now is the opportunity to do such a change

mdedetrich avatar Aug 25 '25 22:08 mdedetrich

@mdedetrich I would prefer that we keep the existing ActorSystem def terminate(): Future[Terminated]. Can we just add def terminateWithCompletionStage(): CompletionStage[Terminated]?

pjfanning avatar Aug 25 '25 22:08 pjfanning

@mdedetrich I would prefer that we keep the existing ActorSystem def terminate(): Future[Terminated].

Well one way to solve this issue would be to provide an API that accepts an ActorSystem as an argument, that way you would have a pekko.scaladsl.actor.Terminate(actorSystem: ActorSystem) which will return Future[Terminated] and you would have a pekko.actor.javadsl.Terminate(actorSystem: ActorSystem)which will return aCompletionStage[Terminated]`.

There are even more natural options than this, one is to make our current ActorSystem private/internal stable API (for 2.0.x), and create an ActorSystem in both the javadsl and scaladsl package that inherits the current ActorSystem and have all of the Future returning methods on the scaladsl Actor and all of the CompletionStage ones on the javadsl actor. Another alternative is to use the delegate pattern which is also frequently used in pekko codebase.

In conjunction you would also deprecate the current def terminate(): Future[Terminated] on ActorSystem in 1.2.x.

I know that this is more work, however the counter argument to that is that the current terminate() method is clearly inconsistent with how the rest of the pekko API's are structured (where the api's are split between the scaladsl and javadsl packages respectively) and hence its current design seems to be a mistake/oversight?

Can we just add def terminateWithCompletionStage(): CompletionStage[Terminated]?

This would be out of place in terms of the rest of the pekko api's are designed.

I have a bit of time at the end of the week, I can create a PR with one of the implementations I suggested (likely would be inheritence version since ActorSystem is abstract anyways and as a bonus these changes wouldn't break binary compatibility so they can be implemented in 1.2.x (and we would only make the abstract ActorSystem private in 2.0.x). We would howeer have to wait for the 1.2.x release a bit longer in such as a case as I would have to mark some current methods for deprecation

mdedetrich avatar Aug 25 '25 22:08 mdedetrich

I'm not at all sure that I like a major refactor just because a Java friendly terminate isn't implemented. Can we at least review the API on the typed ActorSystem which is different from the class ActorSystem? The diff being that terminate returns void/unit. And then users need to call getWhenTerminated() (Java) or whenTerminated() (Scala) to get a CompletionStage or Future as required. These 2 methods are on the classic ActorSystem too.

I think we should try to still bear in mind that it is good to keep the disruption to migrating users to a minimum.

pjfanning avatar Aug 25 '25 23:08 pjfanning

I'm not at all sure that I like a major refactor just because a Java friendly terminate isn't implemented. Can we at least review the API on the typed ActorSystem which is different from the class ActorSystem? The diff being that terminate returns void/unit. And then users need to call getWhenTerminated() (Java) or whenTerminated() (Scala) to get a CompletionStage or Future as required. These 2 methods are on the classic ActorSystem too.

I think we should try to still bear in mind that it is good to keep the disruption to migrating users to a minimum.

The thing is its not just terminate that returns a Future, as you stated there is also whenTerminated but that is still sticking out as a sore thumb. But you are right in that I should look into the typed ActorSystem and see the reasoning/intent behind it returning Void/Unit (I guess they made that method blocking to make it easier to use, although if thats the case that is a not very akka/pekko like???)

Can we hold off on the 1.2.x release until I look into this properly? There is another reason behind this change but I first need to look into how many divergent API's we have (there is a big difference if the only divergence we have is these 2 methods vs a lot of others).

mdedetrich avatar Aug 25 '25 23:08 mdedetrich

@pjfanning @mdedetrich Add a terminateWithCompletionStage is ok too, as there is something like that everywhere:

  def invokeWithFeedbackCompletionStage(t: T): CompletionStage[Done] = {
    import pekko.util.FutureConverters._
    invokeWithFeedback(t).asJava
  }

He-Pin avatar Aug 26 '25 02:08 He-Pin

If we want to do this in a great way for Java, we should reimplement all Java interfaces under javadsl and expose no Scala types, which can be done in 2.0.0, and provide asJava and asScala to convert bidirectionally.

And this method is being used very low frequently, so I suggest we just release the 1.2.0 and then do a better shaping in 2.0.0

A great Java api will need to restructure the current api/implementation.

He-Pin avatar Aug 26 '25 02:08 He-Pin

@mdedetrich separating two javadsl and scaladsl will cause some problems too, when you do searching for the same type in the IDE, then two classes show up, you have to select one, select two methods, or select two classes.

As the terminate method is being used very low frequency, we can just leave it as it is.

in 1.2.0:

  • leave it as it is

in 2.0.0

  • change the result type to Unit and ask users to use whenTerminate/getWhenTerminated

@mdedetrich @pjfanning

He-Pin avatar Aug 26 '25 03:08 He-Pin

@pjfanning @mdedetrich Add a terminateWithCompletionStage is ok too, as there is something like that everywhere:

  def invokeWithFeedbackCompletionStage(t: T): CompletionStage[Done] = {
    import pekko.util.FutureConverters._
    invokeWithFeedback(t).asJava
  }

This is just a bandaid. The ideal state is that for the javadsl, no method accepts/returns a Future and for a scaladsl no method accepts/returns a CompletionStage.

I will work on a PR later on in the week and also explain my intentions (I want to work on a kotlindsl and that only works when we have properly split out dsl's)

mdedetrich avatar Aug 26 '25 07:08 mdedetrich

As the terminate method is being used very low frequency, we can just leave it as it is.

Where is the evidence of this? In order to properly terminate the actor system without creating spurious warnings and/or make sure its properly shut down in terms of ungraceful shutdown, you should use terminate(). Every akka/pekko codebase I have worked on uses it.

mdedetrich avatar Aug 26 '25 07:08 mdedetrich

@mdedetrich separating two javadsl and scaladsl will cause some problems too, when you do searching for the same type in the IDE, then two classes show up, you have to select one, select two methods, or select two classes.

Considering that almost every single other part of the akka/pekko codebase does this where you have this problem, we are so far down that I fail to see how this is a good counter argument

In pekko http, almost every single type/class has a Scala and a Java equivalent

mdedetrich avatar Aug 26 '25 07:08 mdedetrich

Where is the evidence of this

  1. The test cases and the result contains less info, that's why the typed actor system's terminate() function just returns a void.
  2. When can be retrieved with whenTerminated, if care
  3. A user actually cares about gracefulShutdown, not when it shuts down, that's why there is a gracefulShutdown in Netty's EventLoopGroup.

For these simple cases, we do not need to duplicate another ActorSystem just for Java, really, this terminate() call will only be used 1~2 times in your codebase.

He-Pin avatar Aug 26 '25 07:08 He-Pin

@mdedetrich I like your idea, and that's an idea world, btw, I think we should polish the best UX in Java/Kotlin/Scala. seems you are using Kotlin these days? The trailing lambda is cool, which means some javadsl methods should be rewritten for Kotlin friendly.

He-Pin avatar Aug 26 '25 08:08 He-Pin

Where is the evidence of this

  1. The test cases and the result contains less info, that's why the typed actor system's terminate() function just returns a void.

Test cases don't care about graceful shutdown so this is irrelevant here

  1. When can be retrieved with whenTerminated, if care

Right, but if you add a blocking terminated method as you did in https://github.com/apache/pekko/pull/2096, whenTerminated because more annoying to use as you now have to put terminateAndAwait inside a Future in its own dispatcher/blocking.

Terminate is fundamentally an asyc operation, thats why it returns a Future.

  1. A user actually cares about gracefulShutdown, not when it shuts down, that's why there is a gracefulShutdown in Netty's EventLoopGroup.> I know, and graceful shutdown with Akka is done with terminate! Now its true that having a shutdown happen so quickly that it concludes before terminate finishes is rare, but I have seen it happen. For these simple cases, we do not need to duplicate another ActorSystem just for Java, really, this terminate() call will only be used 1~2 times in your codebase. Agreed, but when I said everywhere I meant the amount of projects, not the amount of times per project. It pretty much should be used 100% of the time once for every project, but there are exceptions

@mdedetrich I like your idea, and that's an idea world, btw, I think we should polish the best UX in Java/Kotlin/Scala. seems you are using Kotlin these days? The trailing lambda is cool, which means some javadsl methods should be rewritten for Kotlin friendly.

Yes, and with kotlin you don't use CompletionStage/Future but rather co-routine with suspend fun and as you can see, when I start working on a kotlindsl this is going to just end up polluting the ActorSystem with methods for every type of dsl we could want to support.

When I said what ActorSystem is doing here is an exception, I meant it. Also I think its fairly easy to solve this with minimal disruption, the create() method would just return an ActorSystem from either java/scaladsl but the source code will be exactly the same as now. You can even do this in a binary compatible way.

Please just wait till the end of the week, ill show you what I mean.

mdedetrich avatar Aug 26 '25 08:08 mdedetrich

I don't think we should delay the 1.2.0 release. We can always do a new 1.x release in the future if and when we want to deprecate methods in 1.x in anticipation of 2.0.0.

I don't think we should merge any changes related to this issue while the change itself remains under debate.

pjfanning avatar Aug 26 '25 08:08 pjfanning

I don't think we should delay the 1.2.0 release. We can always do a new 1.x release in the future if and when we want to deprecate methods in 1.x in anticipation of 2.0.0.

Sure, I just think that these changes we are talking about are not that complicated so it can be done pretty quickly, but if its not we can always do a 1.3.0

I don't think we should merge any changes related to this issue while the change itself remains under debate.

Agreed, let me open up the PR just to demonstrate what I am talking about.

mdedetrich avatar Aug 26 '25 08:08 mdedetrich

because more annoying to use as you now have to put terminateAndAwait inside a Future in its own dispatcher/blocking. @mdedetrich Noop, I just need to hook it in Spring's @PreDestory

For Kotlin, I think you should add a new module, which only contains the kotlindsls, and there is no need to change the current shape, because you can forward the call to Java.

He-Pin avatar Aug 26 '25 08:08 He-Pin

@mdedetrich Actually, I'm using Kotlin at $work too, so let's roll 1.2.0 out first. If needed, we can release another 1.3.0 for deprecation when we are developing 2.0.0

He-Pin avatar Aug 26 '25 08:08 He-Pin

I'm not at all sure that I like a major refactor just because a Java friendly terminate isn't implemented.

I'm still skeptical as well - I suspect consistently splitting ActorSystem into a javadsl and a scaladsl variant will uncover some tricky situations and we'll have to carefully consider whether these are 'worth it' - but can't hurt to give it a try. Just don't get tunnel-visioned into this experiment.

I think ActorSystem is 'special' enough that it's OK to be a bit different from the rest, and it's not the only class that is shared between java and scala dsl.

Can we at least review the API on the typed ActorSystem which is different from the class ActorSystem? The diff being that terminate returns void/unit. And then users need to call getWhenTerminated() (Java) or whenTerminated() (Scala) to get a CompletionStage or Future as required. These 2 methods are on the classic ActorSystem too.

The thing is its not just terminate that returns a Future, as you stated there is also whenTerminated but that is still sticking out as a sore thumb.

It's a well-established pattern that, for classes that are shared between javadsl and scaladsl, the Java API uses methods of the form 'getXxx' where the Scala API uses plain xxx. This is also documented as https://github.com/apache/pekko/blob/main/CONTRIBUTING.md#java-apis-in-pekko point 16. It's too bad that we don't have an equivalent for 'actions', but I don't see an obvious candidate.

But you are right in that I should look into the typed ActorSystem and see the reasoning/intent behind it returning Void/Unit (I guess they made that method blocking to make it easier to use, although if thats the case that is a not very akka/pekko like???)

It's not blocking - the docs even say so ("This is an asynchronous operation and completion of the termination can be observed with [[ActorSystem.whenTerminated]] or [[ActorSystem.getWhenTerminated]]."). I suspect the return value was dropped in favor of those two getters precisely because this makes it possible to have separate Java and Scala APIs for watching the termination.

Since we're thinking 2.0.0 anyway: perhaps ActorSystem should implement https://docs.oracle.com/javase/8/docs/api/java/io/Closeable.html, and terminate should be deprecated in favor of close and whenTerminated/getWhenTerminated.

Can we hold off on the 1.2.x release until I look into this properly?

I don't see any reason for that.

raboof avatar Aug 26 '25 09:08 raboof

I'm still skeptical as well - I suspect consistently splitting ActorSystem into a javadsl and a scaladsl variant will uncover some tricky situations and we'll have to carefully consider whether these are 'worth it' - but can't hurt to give it a try. Just don't get tunnel-visioned into this experiment.

I think its quite safe because in reality the only changes will be moving the terminate (and related methods) out of the current base ActorSystem class but the rest will remain the same.

But yes you are right, we should try this out.

Since we're thinking 2.0.0 anyway: perhaps ActorSystem should implement https://docs.oracle.com/javase/8/docs/api/java/io/Closeable.html, and terminate should be deprecated in favor of close and whenTerminated/getWhenTerminated.

This is a good point! If we are going to add Unit/Void based terminate like methods to make it easier, may as well do it in the idiomatic way and implement Java's closable interface rather than calling it terminateAndAwait as we did at https://github.com/apache/pekko/pull/2096

mdedetrich avatar Aug 26 '25 11:08 mdedetrich

So as an update, I did manage to get something working locally and by that I mean changes for 1.2.x to add scaladsl/javadsl for actors without breaking any users. The implementation is a bit gnarly, but a follow up for 2.0.x would make the code a lot cleaner as a lot of code would be removed.

However I agree that 1.2.x shouldn't be blocked because of this (such a change should be in 1.3.x) and also it might make sense to release a pekko 2.0.0 milestone without this actor cahnge so that we don't have so many changes in the first milestone release as we need to do downstream changes to all of the pekko satellite projects. We can do then a 2.0.0-M2 follow up release with the actor changes to make the work more iterative

@raboof @He-Pin @pjfanning

Thoughts?

mdedetrich avatar Sep 06 '25 08:09 mdedetrich

@mdedetrich I was trying to do https://github.com/apache/pekko/issues/2152, but that seems to need rework of many api too. I suggest you just work on the main, at least for me, I did not expect a 1.3.x branch.

He-Pin avatar Sep 06 '25 08:09 He-Pin

@mdedetrich I was trying to do #2152, but that seems to need rework of many api too. I suggest you just work on the main, at least for me, I did not expect a 1.3.x branch.

The thing is, if we want to do breaking changes in 2.0.x series (such as removing the methods that work with Future for Java users) the changes have to land in Pekko 1.x series because we have to then deprecate those methods.

My work is currently based on the 1.2.x branch, but while the changes are mechanically simple it is in terms of volume a big PR also because documentation needs to be updated. There are also open questions about how far we want to go, i.e. do we want to follow the pekko-connectors/http pattern where even if the methods are the same for scaladsl/javadsl we still have copies of those methods in both API's (i.e. they are just alias's in the respective dsl packages) OR do we have these common API's just sit in org.apache.pekko.actor (one nesting above org.apache.pekko.actor.scaladls/org.apache.pekko.actor/javadsl), I wanted to ask this question in the PR.

The tl;dr is, if we want Pekko 2.0.x to be clean and have a consistent API with the rest of pekko these changes need to land in 1.x at some point.

On another note I also looked into the git log and I know why the API for Actor is the way it is, its REALLY old. It actually predates even Java's CompletableFuture/CompletionStage (which only came out in Java 1.8, see https://docs.oracle.com/javase/8/docs/api/java/util/concurrent/CompletableFuture.html and https://docs.oracle.com/javase/8/docs/api/java/util/concurrent/CompletionStage.html) which explains why the API was designed the way it was, at that time CompletableFuture/CompletionStage didn't exist and so they expected Java users to just have to deal with the Scala Future. Java's Future was added in 1.5, but it was largely useless as it was missing the other necessary abstractions such as CompletableFuture/CompletionStage.

I actually think for this reason, 2.0.x is an excellent opportunity to be bold and just remove all of this legacy design and cruft. Whether it be dropping Scala 2.12 (which seems likely), JVM 17 as minimum version which lets us simplify the sbt build a bit or just updating the API's (as we are talking about in this issue) so that both Java and Scala users are happy and also opens the doors for other types of dsl's (i.e. kotlin, clojure if someone is crazy enough????). This will also make our job easier in maintaining pekko, with the limited capacity we have as open source contributors.

We just need to make sure that users are not hanging by waiting extended periods of time for a 2.0.0 release, hence why I think releasing semi-frequent milestones for 2.0.0 is the strategy to go here. This is one reason milestones are designed for, especially when we are doing binary breaking changes due to bump in major versions.

mdedetrich avatar Sep 06 '25 09:09 mdedetrich

if we want to do breaking changes in 2.0.x series (such as removing the methods that work with Future for Java users) the changes have to land in Pekko 1.x series because we have to then deprecate those methods.

Can you remind me why that is?

raboof avatar Sep 06 '25 09:09 raboof

My preference is for changes like this to start on the main branch. When they are agreed and merged, we can then decide what if anything we would like to backport. This is actually how we normally operate even with small bug fixes.

pjfanning avatar Sep 06 '25 09:09 pjfanning

if we want to do breaking changes in 2.0.x series (such as removing the methods that work with Future for Java users) the changes have to land in Pekko 1.x series because we have to then deprecate those methods.

Can you remind me why that is?

Afaik technically speaking with semver its not an issue either way, but as good custodians it makes sense that if we are going to break the API in 2.0.x, that those breakages should be marked as deprecated in 1.x series so that users are aware and the migration is easier, this is at least what @pjfanning wanted earlier if I understood correctly (see https://github.com/apache/pekko/issues/2093#issuecomment-3221993046).

Its not any different to how we just broke the API by removing all of those deprecated methods, the point being that those methods needed to be deprecated first so that users are aware of how to migrate away from those API's as they need to be removed. And while some of those methods were deprecated for a while, as part of the 2.0.x work we did recently deprecate methods in 1.2.x because they are going to be removed in 2.0.0.

Its of course easier to just do all of the API changes in 2.0.x without caring about 1.x, but then it would be a big shock to users.

My preference is for changes like this to start on the main branch. When they are agreed and merged, we can then decide what if anything we would like to backport. This is actually how we normally operate even with small bug fixes.

If we want the new API to be usable in 1.x and the migration to be easier (as said in https://github.com/apache/pekko/issues/2093#issuecomment-3221993046) I need to do the changes in 1.x series first because that would then act as a stepping stone to 2.x. Basically there needs to be a migration path in 1.x series for the current actor API to the new one so users can migrate in the 1.x series to the new API, and then in 2.0.0 we remove the old (ergo 1.x) actor API and just leave the javadsl/scaladsl one.

I don't have an issue in working on main, its actually much easier but then we wouldn't have a nice migration path for users

mdedetrich avatar Sep 06 '25 09:09 mdedetrich

Its of course easier to just do all of the API changes in 2.0.x without caring about 1.x, but then it would be a big shock to users.

Would it? I had assumed it would be mostly source-compatible, perhaps mainly changing some imports?

I don't have an issue in working on main, its actually much easier but then we wouldn't have a nice migration path for users

I think we have to answer 2 questions: 1) "do we want this change at all", 2) "how do we make this change as smooth as possible for users". I think it would be fine to first work on main so we can answer '1'. If based on that we decide we do want this change in 2.x, then we can see if there's anything we can/should do on the 1.x branch to make the transition smoother.

raboof avatar Sep 06 '25 10:09 raboof

Its of course easier to just do all of the API changes in 2.0.x without caring about 1.x, but then it would be a big shock to users.

Would it? I had assumed it would be mostly source-compatible, perhaps mainly changing some imports?

Mostly isn't entirely, and there are more changes aside from imports i.e. ActorSystem.terminate for java api is not returning a scala Future at all so that isn't source compatible.

I don't have an issue in working on main, its actually much easier but then we wouldn't have a nice migration path for users

I think we have to answer 2 questions: 1) "do we want this change at all", 2) "how do we make this change as smooth as possible for users". I think it would be fine to first work on main so we can answer '1'. If based on that we decide we do want this change in 2.x, then we can see if there's anything we can/should do on the 1.x branch to make the transition smoother.

Right, and in my belief the best strategy for figuring out 1 (while also caring about source compatibility) is to actually work on 1.2.x first as I need to actually figure out if its possible.

If we don't care about complete source compatibility, then of course I can do a very clean implementation right now on main right now but I was under the impression that we want an easy migration path which means that for 1.3.x (assuming the actor changes get released then) users get deprecation warnings to change to the javadsl/actordsl API's but the current actor system works as is, and in 2.0.x the current common actor API won't exist anymore.

mdedetrich avatar Sep 06 '25 11:09 mdedetrich