localstack-persist

Scripts run after localstack is "ready" are no longer working

Open adiberk opened this issue 2 years ago • 11 comments

I have scripts that get run on startup that:

  1. Create lambdas (if they aren't present already - I make a function lookup call to check this)
  2. Create SQS- and SNS-related data
  3. Create S3 buckets

In each of these scenarios I first check whether what I am looking to create already exists; if it does, I don't run the creation again.
This worked fine with localstack pods, as I first loaded the data from the pods and then ran these scripts. How can I ensure this works with localstack-persist as well?
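
A minimal sketch of what these existence checks look like in a ready.d script (the bucket and queue names are just placeholders, not taken from this thread):

#!/bin/bash
# Hypothetical idempotent ready.d script: only create resources that aren't there yet.

# S3: head-bucket exits non-zero when the bucket is missing.
if ! awslocal s3api head-bucket --bucket my-test-bucket 2>/dev/null; then
  awslocal s3api create-bucket --bucket my-test-bucket
fi

# SQS: get-queue-url fails when the queue is missing.
if ! awslocal sqs get-queue-url --queue-name my-test-queue >/dev/null 2>&1; then
  awslocal sqs create-queue --queue-name my-test-queue
fi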

Also, instead of persisting data throughout the lifespan of the running image, would it be easier and make more sense to override the "shutdown" logic of localstack and persist at the moment of shutdown?

adiberk avatar Dec 14 '23 15:12 adiberk

What do you mean by "not working"? I just tested putting the following script in /etc/localstack/init/ready.d/hello.sh:

#!/bin/bash

echo "hello from ready script!"
awslocal s3api create-bucket --bucket testbucket

And once the container started, I could see the message in the container logs and it correctly created (and then persisted) the bucket.

Also, instead of persisting data throughout the lifespan of the running image, would it be easier and make more sense to override the "shutdown" logic of localstack and persist at the moment of shutdown?

A potential issue with that method is that it would mean the resources wouldn't get persisted if the container doesn't exit cleanly

GREsau avatar Dec 14 '23 17:12 GREsau

@GREsau I guess my question is the following, then: when does loading from persistence happen, before or after my ready.d scripts? I ask because if I do things like check whether a function already exists before creating it, I need to make sure all of that runs after the persisted items are loaded.

adiberk avatar Dec 15 '23 14:12 adiberk

Loading persisted state is triggered by the on_infra_start hook, which happens before any services are started, and before the ready.d scripts are run (but may happen after boot.d and start.d scripts)

So you should be fine to check the existence of resources in your ready.d scripts 🙂
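
For example, a sketch of such a check for a lambda (the function name and creation details are placeholders): since persisted state is restored during on_infra_start, a get-function call in a ready.d script will already see any restored function:

#!/bin/bash
# Hypothetical ready.d snippet: skip creation if the function was restored from persistence.
if awslocal lambda get-function --function-name my-function >/dev/null 2>&1; then
  echo "my-function already exists (restored from persisted state), skipping creation"
else
  echo "my-function not found, creating it"
  # ... awslocal lambda create-function ... (deployment details omitted)
fi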

GREsau avatar Dec 19 '23 23:12 GREsau

@GREsau Hi, I wanted to circle back on this. It seems the timeout is happening specifically in relation to awslocal lambda calls in my ready script; other awslocal commands seem to work fine. It only happens AFTER the first time I start up localstack without having anything persisted: once I shut down and restart, this issue begins to happen. It seems the "lambda" service takes a very long time to start up. I will note that the lambdas I create are hot-reloadable, if that information helps.

The result is a request taking a super long time and then finally timing out. This seems to only happen when I use the persist library:

awslocal lambda list-functions

Read timeout on endpoint URL: "http://localhost:4566/2015-03-31/functions/"
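
(A diagnostic sketch, not something confirmed in this thread: one way to tell a slow response apart from a hard client-side timeout is to raise or disable the AWS CLI's read timeout, which awslocal passes through to the underlying aws CLI.)

# The read timeout defaults to 60 seconds; 0 disables it, so the call waits
# for however long the lambda provider takes to respond.
awslocal lambda list-functions --cli-read-timeout 0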

I am using FROM gresau/localstack-persist:2.3.2

Here is my localstack compose

  localstack:
    container_name: localstack
    build:
      context: .
      dockerfile: localstack.Dockerfile
    ports:
      - "4566:4566"
      - "8055:8080"
    healthcheck:
      test: awslocal sns list-topics && awslocal sqs list-queues
      interval: 7s
      timeout: 10s
      retries: 10
    environment:
      - DOCKER_HOST=unix:///var/run/docker.sock
      - SERVICES=s3,sns,sqs,events,lambda,logs
      - AWS_DEFAULT_REGION=us-east-1
      - AWS_ACCESS_KEY_ID=1234
      - AWS_SECRET_ACCESS_KEY=1234
      - AWS_ACCOUNT_ID=000000000000
      # - LOCALSTACK_API_KEY=
      - LS_LOG=error
      - ACTIVATE_PRO=0
      # Enable this to turn off persistence if changing resources (so that they are recreated on restart)
      # - PERSIST_DEFAULT=0
      # To Enable persist on specific services
      # - PERSIST_S3=1
      - ROOT=$PWD
      - USE_LOCAL_S3=${USE_LOCAL_S3:-true}
    env_file:
      - .env
    volumes:
      - '/var/run/docker.sock:/var/run/docker.sock'
      - ./bootstrap_localstack/lambdas:/lambdas
      - ./bootstrap_localstack/on_localstack_ready:/etc/localstack/init/ready.d
      - ./bootstrap_localstack/cors.json:/cors.json
      - ./persisted-data:/persisted-data
    networks:
      default:
        aliases:
          - localhost.localstack.cloud
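
(A possible tweak, sketched here rather than taken from the thread: since the slow part is the Lambda API, the compose healthcheck command could also probe lambda, so the container only reports healthy once the persisted functions have loaded.)

# Hypothetical healthcheck test command that also waits for the Lambda API:
awslocal sns list-topics && awslocal sqs list-queues && awslocal lambda list-functions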

adiberk avatar Feb 21 '24 15:02 adiberk

@GREsau One last thing: it seems that after 15-20 minutes the lambda "service" starts up and the previously saved lambdas can be seen. But it seems strange that it would take so long.

adiberk avatar Feb 21 '24 16:02 adiberk

Are you able to share a full reproducible example e.g. including the lambda code and instructions on how they're deployed? That would make it much easier to diagnose the problem

GREsau avatar Feb 21 '24 16:02 GREsau

Hm, that will be a little complicated. What I can tell you is that this is deployed with python3.11, and it is a basic forwarder lambda that forwards messages from a cron job to a FIFO queue. As you can see it is hot-reloaded, which is a feature provided by localstack. Each lambda that gets "deployed" has its own Python virtual env; since it is hot-reloaded it doesn't need to be zipped, etc.

{
    "FunctionName": "cron_event_forwarder",
    "FunctionArn": "arn:aws:lambda:us-east-1:000000000000:function:cron_event_forwarder",
    "Runtime": "python3.11",
    "Role": "arn:aws:iam::000000000000:role/lambda-role",
    "Handler": "forward_event_to_sns.lambda_handler",
    "CodeSize": 0,
    "Description": "",
    "Timeout": 10,
    "MemorySize": 128,
    "LastModified": "2024-02-21T16:01:42.997714+0000",
    "CodeSha256": "hot-reloading-hash-not-available",
    "Version": "$LATEST",
    "Environment": {
        "Variables": {
            "LEV_ENV": "dev",
            "AWS_REGION_NAME": "us-east-1",
            "AWS_ACCOUNT_ID": "000000000000",
            "SNS_TOPIC_NAME": "lev-app-dev.fifo",
            "PYTHONPATH": "env/lib/Python3.11/site-packages"
        }
    },
    "TracingConfig": {
        "Mode": "PassThrough"
    },
    "RevisionId": "4328ca97-c8e3-4f15-8c1b-14204f6f6525",
    "State": "Pending",
    "StateReason": "The function is being created.",
    "StateReasonCode": "Creating",
    "PackageType": "Zip",
    "Architectures": [
        "x86_64"
    ],
    "EphemeralStorage": {
        "Size": 512
    },
    "SnapStart": {
        "ApplyOn": "None",
        "OptimizationStatus": "Off"
    },
    "RuntimeVersionConfig": {
        "RuntimeVersionArn": "arn:aws:lambda:us-east-1::runtime:8eeff65f6809a3ce81507fe733fe09b835899b99481ba22fd75b5a7338290ec1"
    }
}
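
For reference, a hot-reloaded function like the one above is typically created by pointing the code at LocalStack's special "hot-reload" bucket name; this is a sketch assuming that documented convention, with a placeholder path to the function's directory:

# Hypothetical hot-reload deployment: the "hot-reload" bucket name tells LocalStack
# to mount the given directory instead of a zipped package (which is why CodeSize is 0).
awslocal lambda create-function \
  --function-name cron_event_forwarder \
  --runtime python3.11 \
  --handler forward_event_to_sns.lambda_handler \
  --role arn:aws:iam::000000000000:role/lambda-role \
  --timeout 10 \
  --code S3Bucket="hot-reload",S3Key="/path/to/lambdas/cron_event_forwarder"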

adiberk avatar Feb 21 '24 18:02 adiberk

@GREsau Even more interesting: if I use localstack 3.1.0 instead of 2.3.2, it works perfectly as expected! It seems like somehow the lambda "service" in 2.3.2 isn't spinning up right away? Or maybe recreating the saved lambda functions is halting the lambda service? I just can't use the latest version of localstack yet and have to stay on 2.3.2 :(

adiberk avatar Feb 21 '24 19:02 adiberk

Not sure if this helps, but in the JSON files I don't see a "backend" file for lambdas when this happens, only store.json. Other services have that file. (see attached screenshot)

adiberk avatar Feb 21 '24 20:02 adiberk

Just to add to this: it seems to get stuck on one part of my ready script:

FUNCTION_LIST="$(awslocal lambda list-functions)"

I am wondering if there is some sort of race condition happening where it is loading the lambda data and this API call is getting stalled? Basically, somehow this statement ends up stalling my script.

If I run awslocal --version I get no issues.

Please note all of this happens after the FIRST build and start, i.e. after everything gets persisted.
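
(A possible workaround sketch, not from the thread: poll the Lambda API with a short per-call timeout and a retry loop before the rest of the ready script runs, so a single slow list-functions call can't hang the whole init.)

# Hypothetical guard: retry with a short read timeout instead of letting one call block indefinitely.
for i in $(seq 1 30); do
  if FUNCTION_LIST="$(awslocal lambda list-functions --cli-read-timeout 15 2>/dev/null)"; then
    break
  fi
  echo "lambda API not ready yet (attempt $i), retrying..."
  sleep 5
done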

adiberk avatar Sep 03 '24 16:09 adiberk

I'm afraid without a minimal reproducible example, I can't really spend any time looking into what might be going wrong - especially if it's only happening on a previous version

GREsau avatar Sep 03 '24 18:09 GREsau