
How to avoid "Cannot create a local object reference from a grain"

Open raptor2101 opened this issue 1 year ago • 6 comments

TL;DR: If GrainFactory throws an exception during a CreateObjectReference call, it should offer an easy way to access the GrainContext/GrainReference so that the exception can be avoided.

Context: we are integrating Orleans into an application via repositories and factories rather than basing the whole application on it. As a result, the business-logic (BL) objects are not aware of whether they are hosted within a grain or not. Everything works nicely except that the CreateObjectReference calls cause some trouble.

Within a silo there are classes that operate as an event bus and are therefore shared. They are accessed from within a grain call but are not bound to the lifecycle of that grain. These classes therefore need to act as GrainObserver regardless of whether the grain that initially created them is still alive.

We currently solve the problem by "exploiting" the fact that RuntimeContext._threadLocalContext is ThreadStatic. By wrapping the CreateObjectReference call in Task.Run, it is performed on another thread, so it passes without the exception and everything works as expected. (And I'm aware that this is decoupled from the calling grain!)

It looks just a little nasty.
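The thread-static effect described above can be illustrated without Orleans. The following is a minimal sketch: FakeRuntimeContext and ThreadStaticDemo are hypothetical names invented for this illustration and are not part of Orleans. It only shows that a [ThreadStatic] value set on one thread is not visible to a delegate that Task.Run executes on a thread-pool thread.

```csharp
#nullable enable
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical stand-in for a runtime that tracks the "current grain" per thread.
// This is NOT the Orleans RuntimeContext; it only mimics a [ThreadStatic] field.
public static class FakeRuntimeContext
{
    [ThreadStatic]
    public static string? Current;
}

public static class ThreadStaticDemo
{
    // Returns the value seen on the simulated grain thread and the value
    // seen by a delegate scheduled through Task.Run from that thread.
    public static (string? OnGrainThread, string? InsideTaskRun) Run()
    {
        string? onGrainThread = null;
        string? insideTaskRun = null;

        // A dedicated (non-pool) thread plays the role of the thread running a grain call.
        var grainThread = new Thread(() =>
        {
            FakeRuntimeContext.Current = "grain-context";
            onGrainThread = FakeRuntimeContext.Current;

            // Task.Run queues the delegate to the thread pool, so it runs on a
            // different thread whose [ThreadStatic] slot was never assigned.
            insideTaskRun = Task.Run(() => FakeRuntimeContext.Current)
                                .GetAwaiter()
                                .GetResult();
        });

        grainThread.Start();
        grainThread.Join();
        return (onGrainThread, insideTaskRun);
    }
}
```

Calling ThreadStaticDemo.Run() yields "grain-context" for the simulated grain thread and null inside Task.Run. The same mechanism makes the workaround decoupled from the calling grain: the scheduled delegate has no notion of which grain started it.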

raptor2101 · Jun 22 '23 13:06

> If GrainFactory throws an exception during a CreateObjectReference call, it should offer an easy way to access the GrainContext/GrainReference so that the exception can be avoided.

I'm not yet convinced that IGrainFactory.CreateObjectReference should allow easy access to the current IGrainContext, so if you have supporting arguments, I am interested to hear them. The reason it doesn't allow you to call it from within a grain today is that it usually indicates an error in your code. Accessing the internals of a grain from outside of the grain is not allowed because it is unsafe.

Would you mind showing some code to demonstrate the use case? Your solution sounds OK to me, i.e., explicitly escaping the grain context to do this.

ReubenBond · Jun 22 '23 17:06

Here is a simplified code example:

class SomeFactory : ISomeFactory
{
  private readonly IGrainFactory _grainFactory;

  public SomeFactory(IGrainFactory grainFactory)
  {
    _grainFactory = grainFactory;
  }

  public async Task<ISomeInterface> CreateSomething(string tenant, Guid primaryKey)
  {
    var instance = new SomeComplexClass();

    var grain = _grainFactory.GetGrain<ICentralGrain>(primaryKey, tenant);
    // Throws "Cannot create a local object reference from a grain"
    // when CreateSomething is called from within a grain.
    var receiver = _grainFactory.CreateObjectReference<IGrainReceiver>(instance);

    await grain.Subscribe(receiver);

    return instance;
  }
}

This factory is created via DI at startup, and CreateSomething is called throughout the application. The instance is stored in a globally accessible repository.

The thing is: CreateSomething can be called from within a controller call (HTTP call context) or from within a grain (grain call context). In the first case everything works; in the second case the exception is thrown.

To get around this, I've done this:

class SomeFactory : ISomeFactory
{
  private readonly IGrainFactory _grainFactory;

  public SomeFactory(IGrainFactory grainFactory)
  {
    _grainFactory = grainFactory;
  }

  public async Task<ISomeInterface> CreateSomething(string tenant, Guid primaryKey)
  {
    var instance = new SomeComplexClass();

    var grain = _grainFactory.GetGrain<ICentralGrain>(primaryKey, tenant);
    // Task.Run moves the work to a thread-pool thread, outside the grain's
    // thread-static context, so CreateObjectReference no longer throws.
    // Awaiting it ensures the subscription completes and surfaces any exception.
    await Task.Run(async () =>
    {
      var receiver = _grainFactory.CreateObjectReference<IGrainReceiver>(instance);
      await grain.Subscribe(receiver);
    });

    return instance;
  }
}

It is functional but looks nasty, like a dirty hack.

I would prefer something like this:

// GrainContext.Current is the proposed API; it does not exist today.
IGrainReceiver receiver;
if (GrainContext.Current != null)
{
  receiver = GrainContext.Current.AsReference<IGrainReceiver>();
}
else
{
  receiver = _grainFactory.CreateObjectReference<IGrainReceiver>(instance);
}

raptor2101 · Jun 24 '23 19:06

I ran into the same bug.

> I'm not yet convinced that IGrainFactory.CreateObjectReference should allow easy access to the current IGrainContext, so if you have supporting arguments, I am interested to hear them

@ReubenBond I'm not sure why CreateObjectReference would require access to IGrainContext. It seems to be exactly the opposite: it does not require the context, because it works only when the context is null and throws when it is not null (from the Orleans 3.x sources):

public GrainReference CreateObjectReference(IAddressable obj, IGrainMethodInvoker invoker)
{
    if (RuntimeContext.CurrentGrainContext is null) return this.HostedClient.CreateObjectReference(obj, invoker);
    throw new InvalidOperationException("Cannot create a local object reference from a grain.");
}

So the only reason the exception is there is an assumption that such a call may indicate a bug, right? Are you just wondering why anybody would pass an IGrainObserver from within a grain call to another grain?

If so, I can give you an example below (I will make a separate comment because it's really off-topic).

DunetsNM · Dec 01 '23 08:12

OK, here's an example of why it might be useful to create an observer within a grain call and pass it to another grain.

Here are some code excerpts (F#):

type IUntypedGrainCallbackHandler =
    inherit IGrainObserver
    abstract member OnCallback: encodedPayload: byte[] -> unit

type IUntypedGrain =
    inherit IGrainWithGuidCompoundKey
...
    abstract member Invoke:
        encodedOp: byte[]
        -> (* encodedResult *) Task<byte[]>

    abstract member InvokeWithCallback:
        encodedOp: byte[]
        -> callbackHandler: IUntypedGrainCallbackHandler
        -> (* encodedResult *) Task<byte[]>

This is a very abstract grain interface which is going to be used in many different Orleans projects/clusters so that they can communicate with each other natively. The goal is to make the cross-cluster grain interface as stable as possible so that it never needs to evolve (which would require downtime and redeployment of all clusters). Here I completely sidestep Orleans serialization and instead pass byte arrays in both directions (all projects follow a certain protocol to encode and decode typed data; we use our own serialization instead of relying on Orleans' evolution capabilities because we find the latter too limited).

So that's the interface for communicating between Orleans clusters. Within each Orleans cluster, however, we use different typed grains. UntypedGrain is just a proxy stateless worker that needs to dispatch any operation either without an observer (Invoke) or with an observer (InvokeWithCallback): it decodes the op and passes it to the proper typed grain, then encodes the response and passes it back to the client (yes, I know this potentially means an extra silo hop; that is a price I'm happy to pay).

In the case of InvokeWithCallback, the observer also acts as a proxy: it encodes the typed value observed by another, typed observer into a byte array and notifies the client in the other cluster. That's the idea.

DunetsNM · Dec 01 '23 08:12

I’m having issues with this exception when calling CreateObjectReference in a co-hosted process when it is unambiguously NOT being called from a grain. Is it possible non-grain work is being scheduled on a grain thread without the context being cleared?

Edit: I've logged thread IDs in the debugger and it looks like this is what's happening. I can trigger this 100% of the time if I call CreateObjectReference synchronously immediately after a grain call. Adding any sort of async suspension, e.g. a 1 ms sleep, suppresses it. I presume there's some sort of async cleanup? I'm on dotnet 8.0.201, Orleans 8.0, macOS 14.4, FSharp.

samritchie · Mar 19 '24 07:03

> Is it possible non-grain work is being scheduled on a grain thread without the context being cleared?

It's possible - that would be a bug. Could you try to reproduce this on 8.1.0-preview3? If it reproduces, could you share a repro?

ReubenBond · Mar 21 '24 13:03