otp Actor's `init` taking longer than `init_timeout` causes entire program to crash

Like the title describes, I've run into an issue where if the inittakes too long instead of returning an error then actor.start_spec will crash the entire program with the following error:

Runtime terminating during boot (killed)

Crash dump is being written to: erl_crash.dump...done

I'm using Erlang/OTP 27 and have tried on Erlang/OTP 25 with the same result.

Below is a code snippet that demonstrates the issue.

import gleam/erlang/process
import gleam/option
import gleam/otp/actor

pub type State {
  State(Int)
}

pub type Message {
  Increment
}

pub fn main() {
  start_actor()

  process.sleep_forever()
}

pub fn start_actor() {
  let spec: actor.Spec(State, Message) =
    actor.Spec(init:, init_timeout: 10, loop:)

  actor.start_spec(spec)
}

pub fn init() {
  process.sleep(1_000_000)
  actor.Ready(State(0), process.new_selector())
}

pub fn loop(_message, state) {
  actor.Continue(state, option.None)
}

Aug 27 '24 18:08 brettkolodny

Hello! the program is crashing as the linked child process is timing out and it is not supervised, so the exit continues to propagates until it reaches a process trapping exits (there is not one in this case) or until it reaches the root process, crashing the program.

Do you have some particular issue?

Sep 01 '24 14:09 lpil

If this is the expected behavior then maybe I'm misunderstanding the documentation, or something specific about my implementation. It says here:

Start an actor from a given specification. If the actor’s init function returns an error or does not return within init_timeout then an error is returned.

Based on the documentation my assumption was that the code above wouldn't crash but instead would instead return Error(StatError). Maybe the language "returns an error" is a bit overloaded here and it would be clearer to re-word it?

Sep 01 '24 14:09 brettkolodny

Ah yes, a bug for sure. Thank you.

I presume we want it to not crash but I've not the time to sit down and think through the implications of that and how we would implement it.

I'm also thinking back to something @crowdhailer said years back, that perhaps init should not fail and starting an actor shouldn't return a result. Could be a good thing to think about now as we're looking at making some breaking changes to support naming.

Sep 01 '24 15:09 lpil

My opinion has not changed. But I don't have any new compelling arguments.

Sep 01 '24 16:09 CrowdHailer

What were your old arguments? If you can remember the details 😄

Sep 03 '24 13:09 lpil

I think that you should always be able to validate the initial state before calling an init. And you shouldn't rely on anther pid/service without you actor understanding failure in reaching that service. I think Async init was a problem. I dunno, I'd have to play again to make better arguments

Sep 03 '24 14:09 CrowdHailer

otp otp copied to clipboard

Actor's `init` taking longer than `init_timeout` causes entire program to crash

otp
otp copied to clipboard