How to run Snowstorm in an ECS cluster?

Open Nareshsam95 opened this issue 2 years ago • 9 comments

Can anyone tell me how to run Snowstorm in an ECS cluster? I have tried many approaches, but my containers are not starting in the ECS cluster.

Thanks in advance.

Nareshsam95 avatar Aug 24 '22 04:08 Nareshsam95

Which approach are you trying, Fargate or EC2 instances? An EC2 instance may be easier to debug. You could try starting one manually and connecting to it to check the logs. Ensure there is at least 8G of memory available to the container.
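For the EC2 route, a debugging session on the container instance might look something like this (the container name `snowstorm` is an assumption; adjust to whatever your task names the container):

```shell
docker ps -a --filter name=snowstorm   # has the container exited?
docker logs --tail 200 snowstorm       # read the startup logs for the failure
free -h                                # check there is enough free memory
```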

Community input welcome on this one!

kaicode avatar Aug 24 '22 09:08 kaicode

I'm running in Fargate.

Nareshsam95 avatar Aug 24 '22 09:08 Nareshsam95

I would start by getting Elasticsearch 7.x running in Fargate. There are a few examples when I search, but they use Elasticsearch 8.x which is not compatible with Snowstorm.

Once you have Elasticsearch 7.x running, you need to make sure Snowstorm starts only after the Elasticsearch API is up (usually on http://localhost:9200); otherwise Snowstorm startup will fail.
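One way to enforce that ordering is a small readiness gate in the container's entrypoint. A minimal sketch, assuming the default Elasticsearch port and a `curl`-capable image (the function name and retry budget are made up for illustration):

```shell
# Poll a health URL until it responds; returns non-zero if the
# endpoint never comes up within the retry budget.
wait_for_es() {
  url="$1"
  retries="${2:-60}"
  i=0
  until curl -sf "$url" >/dev/null 2>&1; do
    i=$((i + 1))
    [ "$i" -ge "$retries" ] && return 1
    sleep 1
  done
  return 0
}

# Example: gate Snowstorm startup on the Elasticsearch health endpoint.
# wait_for_es "http://localhost:9200/_cluster/health" 120 && exec java -jar snowstorm.jar
```

In ECS specifically, the same effect can be achieved without a wrapper script by giving the Elasticsearch container a `healthCheck` and declaring a `dependsOn` condition of `HEALTHY` on the Snowstorm container in the task definition.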

If you have any specific issues feel free to post the logs here and we will attempt to help you debug.

kaicode avatar Aug 30 '22 11:08 kaicode

We're trying to figure this out right now too. One big learning: Amazon adopted OpenSearch, but the OpenSearch service can still spin up an Elasticsearch cluster for you. Under Deployment type, enable "Include older versions" and you'll be able to choose ES 7.x. [Screenshot of the Deployment type settings]
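For anyone scripting this rather than clicking through the console, the same thing can be done from the AWS CLI; a sketch, where the domain name, instance type, and volume size are placeholders, not recommendations:

```shell
aws opensearch create-domain \
  --domain-name snowstorm-es \
  --engine-version Elasticsearch_7.10 \
  --cluster-config InstanceType=r6g.large.search,InstanceCount=1 \
  --ebs-options EBSEnabled=true,VolumeType=gp3,VolumeSize=100
```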

tia-schung avatar Aug 31 '22 17:08 tia-schung

I have tried to run Elasticsearch in my ECS Fargate cluster, but the Elasticsearch container goes straight into an exited state and never stays running.

Nareshsam95 avatar Sep 01 '22 06:09 Nareshsam95

Are you able to capture any logs from Elasticsearch? It may be a disk space issue, which could be fixed by configuration changes.
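If the logs do point at disk space, the usual culprit is Elasticsearch's disk watermark checks, which stop the node when the volume fills past a threshold. A sketch of the relevant elasticsearch.yml settings (the thresholds here are illustrative, not recommendations):

```yaml
cluster.routing.allocation.disk.watermark.low: 90%
cluster.routing.allocation.disk.watermark.high: 95%
cluster.routing.allocation.disk.watermark.flood_stage: 97%
```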

kaicode avatar Sep 01 '22 08:09 kaicode

Yes, I have seen the logs. There were VM memory issues, but I don't know how to add memory in the Task Definition.
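For reference, in ECS memory is set at the task level (and optionally per container) in the task definition JSON. A minimal Fargate sketch; the image tag, CPU/memory values, and heap sizes are assumptions, not recommendations:

```json
{
  "family": "elasticsearch",
  "requiresCompatibilities": ["FARGATE"],
  "networkMode": "awsvpc",
  "cpu": "2048",
  "memory": "8192",
  "containerDefinitions": [
    {
      "name": "elasticsearch",
      "image": "docker.elastic.co/elasticsearch/elasticsearch:7.17.10",
      "essential": true,
      "environment": [
        { "name": "discovery.type", "value": "single-node" },
        { "name": "ES_JAVA_OPTS", "value": "-Xms4g -Xmx4g" }
      ],
      "portMappings": [{ "containerPort": 9200 }]
    }
  ]
}
```

The JVM heap (-Xms/-Xmx) should stay well below the task memory so the rest of the process has headroom.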

Nareshsam95 avatar Sep 01 '22 08:09 Nareshsam95

We didn't want to deal with running Elasticsearch ourselves, so we are running Snowstorm in an ECS container talking to an AWS OpenSearch instance running Elasticsearch 7.7. Here's the Dockerfile we're using:

FROM snomedinternational/snowstorm:7.9.3

USER snowstorm

ARG PORT=${Port}

EXPOSE $PORT

# Shell form is used so ${OPENSEARCH_URL} and ${OPENSEARCH_PASSWORD} are
# expanded at runtime; the JSON exec form does not substitute variables.
ENTRYPOINT java -Xms2g -Xmx3g -jar snowstorm.jar \
    --elasticsearch.urls=${OPENSEARCH_URL} \
    --elasticsearch.username=snowstorm \
    --elasticsearch.password=${OPENSEARCH_PASSWORD}

The Snowstorm docs say you need at least 2g of memory, but we had to bump it up to 3g to load data into Elasticsearch. That's why the max heap size is 3g instead of 2g.

Once that's running, you can load the data into Elasticsearch:

1. Put the SNOMED data files in S3.
2. Open a shell on the container:
   aws ecs execute-command --region <your-region> --cluster <your-cluster> --task <task-id> --container <your-container> --command "/bin/sh" --interactive
3. Run wget <s3 url> to copy the file onto the container.
4. Still in the shell, run the curl commands from the Snowstorm documentation.

tia-schung avatar Feb 01 '23 16:02 tia-schung

What type of Elasticsearch instance are you using?

matiasict avatar Nov 15 '23 21:11 matiasict