dotnet-sdk

PostgreSQL State Store Issue with Actor Reminders

Open sregdorr opened this issue 4 years ago • 2 comments

Expected Behavior

I have an Actor that simulates a fax job that takes some time to complete and can end with either a successful or failed fax transmission. After creating the faxJob and receiving a jobId from the fax vendor, I kick off a reminder that polls the fax vendor every 10 seconds for fax completion. Once the fax status comes back as complete, I unregister the reminder.

Actual Behavior

With the Redis state store, everything works beautifully. After switching to the PostgreSQL state store, I receive the following error every time I try to unregister the reminder:

Dapr.DaprApiException: error deleting actor reminder: possible etag mismatch. error from state store
         at Dapr.Actors.DaprHttpInteractor.SendAsyncHandleUnsuccessfulResponse(Func`1 requestFunc, String relativeUri, CancellationToken cancellationToken)
         at Dapr.Actors.DaprHttpInteractor.SendAsync(Func`1 requestFunc, String relativeUri, CancellationToken cancellationToken)
         at Dapr.Actors.Runtime.DefaultActorTimerManager.UnregisterReminderAsync(ActorReminderToken reminder)
         at FaxService.App.Actors.ServiceFaxJobManager.FaxJobManager.HandleFaxJobStatus(String status, String msg)

I also receive the following log from the dapr sidecar:

level=info msg="reminder StatusReminder with parameters: dueTime: 2021-09-28T11:31:13-05:00, period: 0h0m10s0ms, data:  has been deleted." app_id=fax-service instance=RRODGERS-7080 scope=dapr.runtime.actor type=log ver=1.4.2

Checking the database reveals that the reminder has in fact NOT been deleted:

FaxJobManager||6d8528da-a9a3-43e5-b5d3-a2498815758f||StatusReminder,"{""lastFiredTime"":""2021-09-28T11:31:13-05:00"",""repetitionLeft"":-1}",2021-09-28 16:31:29.878381 +00:00,

Please note: invoking the reminder deletion manually, i.e. DELETE http://localhost:3503/v1.0/actors/FaxJobManager/<faxJobId>/reminders/StatusReminder, works just fine.
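
For anyone who wants to script the manual deletion rather than issue it by hand, here is a minimal C# sketch of the same DELETE call. It assumes the sidecar HTTP port from the URL above (3503); the class and method names (ReminderHttp, BuildReminderUri, DeleteReminderAsync) are hypothetical helpers, not part of the Dapr SDK.

```csharp
using System;
using System.Net.Http;
using System.Threading.Tasks;

// Hypothetical helper, not part of the Dapr SDK.
public static class ReminderHttp
{
    // Builds the sidecar URL for a reminder, following the URL shape quoted above.
    public static Uri BuildReminderUri(
        int daprHttpPort, string actorType, string actorId, string reminderName) =>
        new Uri(
            $"http://localhost:{daprHttpPort}/v1.0/actors/" +
            $"{Uri.EscapeDataString(actorType)}/" +
            $"{Uri.EscapeDataString(actorId)}/reminders/" +
            $"{Uri.EscapeDataString(reminderName)}");

    // Sends the DELETE and reports whether the sidecar answered with a 2xx status.
    public static async Task<bool> DeleteReminderAsync(HttpClient http, Uri reminderUri)
    {
        using var response = await http.DeleteAsync(reminderUri);
        return response.IsSuccessStatusCode;
    }
}
```

Usage would look like `await ReminderHttp.DeleteReminderAsync(new HttpClient(), ReminderHttp.BuildReminderUri(3503, "FaxJobManager", faxJobId, "StatusReminder"))`, where `faxJobId` is the actor id of the job in question.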

Steps to Reproduce the Problem

Here are the methods that handle the reminder:

private async Task StartPollingForCompletion()
{
    if (_activeTimers.Contains(StatusReminder))
    {
        Logger.LogWarning("Grain for FaxJobId: {Id} is already " +
                          "polling for completion", ActorId);
    }

    var period = TimeSpan.FromSeconds(10);
    await RegisterReminderAsync(
        "StatusReminder",
        null,
        period,
        period);
}

private async Task TimerTick()
{
    Logger.LogInformation(
        "Checking FaxJob {Id} for completion",
        ActorId);
    var response = await _faxingService.CheckStatus(_faxJobState.JobId!);

    await HandleFaxJobStatus(response.Status, response.ErrorMessage);
}

private async Task HandleFaxJobStatus(string status, string? msg)
{
    switch (status)
    {
        case "success":
            ...
            break;
        case "failure":
            ...
            break;
        default:
            Logger.LogInformation("FaxJobId {Id} is still pending...", ActorId);
            return;
    }

    await UnregisterReminderAsync("StatusReminder");      //  <----- This is where I receive the error
    _activeTimers = _activeTimers.Where(x => x != StatusReminder).ToList();
}

public Task ReceiveReminderAsync(string reminderName, byte[] state, TimeSpan dueTime, TimeSpan period)
    => reminderName switch
    {
        StatusReminder => TimerTick(),
        _ => Task.CompletedTask
    };

My state store is configured as follows:

apiVersion: dapr.io/v1alpha1
kind: Component
metadata:
  name: statestore
spec:
  type: state.postgresql
  version: v1
  metadata:
    - name: connectionString
      value: "host=... user=... password=... port=5432 connect_timeout=10 database=dapr_test"
    - name: actorStateStore
      value: "true"

Setup

  • Dapr.AspNetCore v1.4.0 (tested with 1.3.0 as well)
  • Dapr.Actors.AspNetCore v1.4.0 (tested with 1.3.0 as well)
  • dotnet v5.0.301
  • dapr environment: Self-Hosted

Release Note

RELEASE NOTE:

sregdorr avatar Sep 28 '21 17:09 sregdorr

I am wondering whether this is not just a dotnet-sdk issue but a more general one affecting all SDKs. If so, this issue should probably be moved to the dapr/dapr repo.

fbridger avatar Oct 02 '21 13:10 fbridger

@halspang - can you try to repro? I suspect we'll need to move this to dapr/components-contrib or dapr/dapr eventually, but we should scope it out first.

rynowak avatar Oct 19 '21 20:10 rynowak

A very delayed response, but this was an issue with the runtime that was actually fixed in 1.9.

The runtime was trying to validate the reminder's remaining invocations/TTL, which caused the reminder to be replaced at the end of the call. Since you were removing it in the method itself, the runtime would hit an etag error and replace the reminder.

Closing as this has been fixed.

halspang avatar Oct 28 '22 21:10 halspang
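
The race described in the last comment can be illustrated with a toy model. This is only a sketch under stated assumptions: ToyStore is a hypothetical in-memory store with etags, not the PostgreSQL component, and the fallBackToBlindWrite flag is an illustrative stand-in for whatever the pre-1.9 runtime did after the etag mismatch. It is not Dapr's actual code.

```csharp
using System;
using System.Collections.Generic;

// Toy in-memory store with optimistic concurrency. NOT Dapr's implementation.
public sealed class ToyStore
{
    private readonly Dictionary<string, (string Value, int ETag)> _rows = new();
    private int _nextETag = 1;

    // Unconditional write; returns the new etag.
    public int Upsert(string key, string value)
    {
        var etag = _nextETag++;
        _rows[key] = (value, etag);
        return etag;
    }

    // Conditional write; fails when the row is missing or the etag is stale.
    public bool TryUpdate(string key, string value, int expectedETag)
    {
        if (!_rows.TryGetValue(key, out var row) || row.ETag != expectedETag)
            return false;
        _rows[key] = (value, _nextETag++);
        return true;
    }

    public bool Delete(string key) => _rows.Remove(key);
    public bool Contains(string key) => _rows.ContainsKey(key);
}

public static class ReminderRace
{
    // Simulates one reminder tick; returns true if the reminder still exists afterwards.
    public static bool RunTick(bool fallBackToBlindWrite)
    {
        var store = new ToyStore();
        const string key = "FaxJobManager||job-1||StatusReminder"; // hypothetical id
        var etag = store.Upsert(key, "tick=0"); // runtime's view before the callback

        store.Delete(key); // the actor unregisters the reminder inside the callback

        // Post-callback bookkeeping by the runtime, using the now-stale etag.
        if (!store.TryUpdate(key, "tick=1", etag) && fallBackToBlindWrite)
        {
            store.Upsert(key, "tick=1"); // the deleted reminder is written back
        }
        return store.Contains(key);
    }

    public static void Main()
    {
        Console.WriteLine(RunTick(fallBackToBlindWrite: true));  // True
        Console.WriteLine(RunTick(fallBackToBlindWrite: false)); // False
    }
}
```

In this model the delete only sticks when the stale conditional write is allowed to fail without a follow-up write, which matches the symptom reported above: the sidecar logs the deletion, yet the row is still in the table afterwards.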