localstack-persist

After container restart, eventbridge target fails

Open Kieranties opened this issue 7 months ago • 2 comments

I've been testing this project with SQS queues and have had no issues. It's been great!

I'm now updating my infrastructure to include EventBridge. In principle, I have the following setup: EventBridge -publishes-> SQS <-pulls- Some Service

My infrastructure is deployed via terraform.

  • I start the container with a volume mount pointing at the /persisted-data folder
  • I apply my terraform successfully
  • I can send messages to EventBridge successfully, and see them flow to my service, which consumes them (the following .http request will publish a message)
###
POST https://localhost.localstack.cloud:4566
X-Amz-Target: AWSEvents.PutEvents
Content-Type: application/x-amz-json-1.1

{
    "Entries": [
        {
            "Source": "test.example",
            "DetailType": "example.user.created",
            "Detail": "{\"id\":\"{{$guid}}\",\"name\":\"SOME_NAME\",\"age\":\"{{$randomInt}}\"}",
            "EventBusName": "default"
        }
    ]
} 
  • I can see persistence messages in the LocalStack logs
  • I stop the container
  • I start the container
  • I can see log messages showing that the persisted state is loaded
  • In the LocalStack dashboard, I can see that:
    • The event rule still exists
    • The event rule target still exists
    • The SQS queue still exists
  • I then send a message as before, and the following is logged (when in debug)
2025-05-15T15:35:24.567 ERROR --- [et.reactor-1] l.aws.handlers.logging     : exception during call chain
Traceback (most recent call last):
  File "/opt/code/localstack/.venv/lib/python3.11/site-packages/rolo/gateway/chain.py", line 166, in handle
    handler(self, self.context, response)
  File "/opt/code/localstack/localstack-core/localstack/aws/handlers/service.py", line 113, in __call__
    handler(chain, context, response)
  File "/opt/code/localstack/localstack-core/localstack/aws/handlers/service.py", line 83, in __call__
    skeleton_response = self.skeleton.invoke(context)
                        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/code/localstack/localstack-core/localstack/aws/skeleton.py", line 154, in invoke
    return self.dispatch_request(serializer, context, instance)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/code/localstack/localstack-core/localstack/aws/skeleton.py", line 168, in dispatch_request
    result = handler(context, instance) or {}
             ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/code/localstack/localstack-core/localstack/aws/skeleton.py", line 118, in __call__
    return self.fn(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/code/localstack/localstack-core/localstack/aws/api/core.py", line 163, in operation_marker
    return fn(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^
  File "/opt/code/localstack/localstack-core/localstack/services/events/provider.py", line 1336, in put_events
    entries, failed_entry_count = self._process_entries(context, entries)
                                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/code/localstack/localstack-core/localstack/services/events/provider.py", line 1944, in _process_entries
    self._process_entry(event, processed_entries, failed_entry_count, context)
  File "/opt/code/localstack/localstack-core/localstack/services/events/provider.py", line 1993, in _process_entry
    self._process_rules(rule, region, account_id, event_formatted)
  File "/opt/code/localstack/localstack-core/localstack/services/events/provider.py", line 2040, in _process_rules
    target_sender = self._target_sender_store[target_arn]
                    ~~~~~~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^
KeyError: 'arn:aws:sqs:us-east-1:000000000000:example_user_created'
2025-05-15T15:35:24.572  INFO --- [et.reactor-1] localstack.request.aws     : AWS events.PutEvents => 500 (InternalError)
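For anyone who wants to script the repro instead of using an .http file, here is a minimal sketch of the same PutEvents request using only the Python standard library. The endpoint, headers, and payload fields are copied from the request above; the function names and the 0 to 100 range for the random age are assumptions made for illustration. Like the original request, it sends no signature or credentials.

```python
import json
import random
import urllib.request
import uuid

# Endpoint taken from the .http request above.
ENDPOINT = "https://localhost.localstack.cloud:4566"


def build_put_events_request(name="SOME_NAME"):
    """Build the same PutEvents call as the .http request, without sending it."""
    detail = {
        "id": str(uuid.uuid4()),             # stands in for {{$guid}}
        "name": name,
        "age": str(random.randint(0, 100)),  # stands in for {{$randomInt}}; range is arbitrary
    }
    body = {
        "Entries": [
            {
                "Source": "test.example",
                "DetailType": "example.user.created",
                "Detail": json.dumps(detail),  # Detail is a JSON string, not an object
                "EventBusName": "default",
            }
        ]
    }
    headers = {
        "X-Amz-Target": "AWSEvents.PutEvents",
        "Content-Type": "application/x-amz-json-1.1",
    }
    return urllib.request.Request(
        ENDPOINT,
        data=json.dumps(body).encode("utf-8"),
        headers=headers,
        method="POST",
    )


def send(request):
    """Send the request and return (status, parsed JSON body).

    urllib raises urllib.error.HTTPError for a 500 response such as the
    InternalError logged above.
    """
    with urllib.request.urlopen(request) as response:
        return response.status, json.loads(response.read())
```

Calling `send(build_put_events_request())` once before and once after the container restart should show the difference between the two runs.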

Additional

  • During the restart there are no errors relating to loading the state data
2025-05-15T16:35:23.840  INFO --- [  MainThread] localstack_persist.state   : Loading persisted state of all services...
2025-05-15T16:35:23.840  INFO --- [  MainThread] localstack_persist.state   : Loading persisted state of service sts...
2025-05-15T16:35:23.890  INFO --- [  MainThread] localstack_persist.state   : Loading persisted state of service cloudwatch...
2025-05-15T16:35:23.964  INFO --- [  MainThread] localstack_persist.state   : Loading persisted state of service iam...
2025-05-15T16:35:24.232  INFO --- [  MainThread] localstack_persist.state   : Loading persisted state of service events...
2025-05-15T16:35:24.382  INFO --- [  MainThread] localstack_persist.state   : Loading persisted state of service sqs...
  • I considered that the container might not have been given long enough to persist data when stopping. I increased the stop timeout to 30 seconds and ensured it shut down gracefully. This did not resolve the issue
  • I changed the persistence format to binary. This did not fix the issue, but it did show what may be a different problem
2025-05-15T16:24:46.799  WARN --- [ead-4 (_run)] l.s.pickle.serializer      : Error while pickling state <class 'localstack.services.stores.AccountRegionBundle'>, falling back to slower 'dill' pickler
Traceback (most recent call last):
  File "/opt/code/localstack/.venv/lib/python3.11/site-packages/localstack_persist/serialization/pickle/serializer.py", line 34, in serialize
    pickler.dump(data)
AttributeError: Can't pickle local object 'SqsQueue.default_attributes.<locals>.<lambda>'
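The pickling warning above can be reproduced in isolation. The sketch below is not LocalStack code: `default_attributes` and `try_pickle` are hypothetical names chosen to mirror the message. It shows that the standard pickler cannot serialize a lambda defined inside a function, which is the kind of object named in the warning. Note that the log reports a fallback to dill, so this may well be unrelated to the EventBridge failure.

```python
import pickle


def default_attributes():
    # A lambda created inside a function is a "local object". pickle stores
    # functions by qualified name and cannot look this one up again later.
    return {"CreatedTimestamp": lambda: "0"}


def try_pickle(obj):
    """Return None on success, or the exception raised by the standard pickler."""
    try:
        pickle.dumps(obj)
    except (AttributeError, pickle.PicklingError) as exc:
        # The exception type differs between Python versions, so catch both.
        return exc
    return None


error = try_pickle(default_attributes())
print(type(error).__name__, error)
```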

Kieranties, May 15 '25 16:05

I think this is related to the new EventBridge provider in LocalStack 4 (see https://github.com/localstack/localstack/releases/tag/v4.0.0), which seems to handle persistence a bit differently. For now, you may be able to work around it by switching to the old provider with the environment variable PROVIDER_OVERRIDE_EVENTS=v1, but that's likely to be removed in a future version, so I'll leave this issue open until I can get a proper fix in.
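One way to picture the KeyError in the traceback is a toy model in which rule targets are persisted but the per-target sender objects live only in memory. This is purely illustrative: `ToyEventsProvider` and its methods are hypothetical, only the `_target_sender_store` name and the ARN come from the traceback, and whether LocalStack behaves this way internally is an assumption.

```python
class ToyEventsProvider:
    """Hypothetical stand-in for an events provider; not LocalStack's code."""

    def __init__(self, persisted_targets=None):
        # State that survives a restart: rule targets keyed by ARN.
        self.targets = dict(persisted_targets or {})
        # Runtime-only helpers, created when a target is registered.
        self._target_sender_store = {}

    def put_target(self, arn):
        self.targets[arn] = {"Arn": arn}
        self._target_sender_store[arn] = lambda event: f"sent {event} to {arn}"

    def put_event(self, event):
        results = []
        for arn in self.targets:
            sender = self._target_sender_store[arn]  # KeyError if never rebuilt
            results.append(sender(event))
        return results


ARN = "arn:aws:sqs:us-east-1:000000000000:example_user_created"

before = ToyEventsProvider()
before.put_target(ARN)
delivered = before.put_event("e1")  # works: the sender was built by put_target

# Simulated restart: the targets are restored, the senders are not.
after = ToyEventsProvider(persisted_targets=before.targets)
try:
    after.put_event("e2")
    restart_error = None
except KeyError as exc:
    restart_error = exc
```

In this model the rule and target still appear to exist after the restart, which matches what the dashboard showed, while delivery fails on the missing sender.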

GREsau, May 15 '25 21:05

@GREsau Spot on. After updating the environment to use the legacy provider, the restart is successful. This is sufficient for me now, but any further updates would be greatly appreciated.
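As a rough sketch of the workaround, the snippet below builds a `docker run` command that sets PROVIDER_OVERRIDE_EVENTS=v1 and mounts a host folder at /persisted-data. The environment variable, mount path, and port 4566 come from this thread; the image name `gresau/localstack-persist`, the host folder, and the helper name are assumptions, so adjust them to match your setup.

```python
import os
import shlex


def docker_run_command(data_dir, image="gresau/localstack-persist"):
    """Build a docker run command that uses the legacy EventBridge provider.

    The image name is an assumption; data_dir is any host folder used for state.
    """
    return [
        "docker", "run", "--rm",
        "-p", "4566:4566",                                 # port used in the request above
        "-e", "PROVIDER_OVERRIDE_EVENTS=v1",               # switch to the old events provider
        "-v", f"{os.path.abspath(data_dir)}:/persisted-data",  # volume mount for persisted state
        image,
    ]


command = docker_run_command("./localstack-data")
print(shlex.join(command))
```

The same environment variable can be set in a compose file or Terraform-managed container definition instead; the list form here can also be passed directly to `subprocess.run`.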

Kieranties, May 16 '25 08:05