text-generation-inference
                        Falcon-40b-instruct deployment in SageMaker fails when using serial inference pipeline
System Info
Hi,
Deployment of Falcon-40b-instruct as a SageMaker endpoint worked well for me when following this tutorial. However, when I try to deploy the container as part of a serial inference pipeline, I get the following errors:
2023-06-16T12:57:41.398047Z ERROR text_generation_launcher: Shard 3 failed to start:
[W socket.cpp:601] [c10d] The client socket has failed to connect to [localhost]:29500 (errno: 99 - Cannot assign requested address).
You are using a model of type RefinedWeb to instantiate a model of type . This is not supported for all configurations of models and can yield errors.
2023-06-16T12:57:41.398097Z  INFO text_generation_launcher: Shutting down shards
2023-06-16T12:57:41.432637Z  INFO text_generation_launcher: Shard 2 terminated
2023-06-16T12:57:42.131047Z  INFO text_generation_launcher: Shard 0 terminated
Error: ShardCannotStart
Any idea what could be causing this and how it can be resolved?
Thanks!
Information
- [X] Docker
- [ ] The CLI directly
Tasks
- [X] An officially supported command
- [ ] My own modifications
Reproduction
- Follow the steps in this manual: https://samuelabiodun.medium.com/how-to-deploy-a-pytorch-model-on-sagemaker-aa9a38a277b6
- Deploy the model as part of serial inference pipeline. https://docs.aws.amazon.com/sagemaker/latest/dg/inference-pipelines.html
- Model deployment fails with the errors shown above.
Expected behavior
Successful deployment.
The client socket has failed to connect to [localhost]:29500 (errno: 99 - Cannot assign requested address)
This seems to be the issue. Could it be that something is already running on that port, for instance?
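One quick way to check whether the port is already taken on the host before launching the container — a minimal sketch (the `port_in_use` helper and the hard-coded port are illustrative, not part of TGI):

```python
import socket

def port_in_use(port: int, host: str = "127.0.0.1") -> bool:
    """Return True if something is already listening on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        # connect_ex returns 0 when the connection succeeds,
        # i.e. when a listener is already bound to the port.
        return s.connect_ex((host, port)) == 0

if __name__ == "__main__":
    print(f"29500 in use: {port_in_use(29500)}")
```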
Yes, it is possible. Is there a way to override the default port value (29500)?
Yes: use the `--master-port` argument or the `MASTER_PORT` environment variable.
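For a SageMaker deployment, the override can be passed through the container's environment — a minimal sketch (the model ID, GPU count, and port value are illustrative; the dict would be handed to the sagemaker SDK's model `env` parameter):

```python
# Sketch: environment for the TGI container, moving the torch.distributed
# rendezvous port off the default 29500 that is already occupied.
tgi_env = {
    "HF_MODEL_ID": "tiiuae/falcon-40b-instruct",  # illustrative model ID
    "SM_NUM_GPUS": "4",                           # shard across 4 GPUs
    "MASTER_PORT": "29600",                       # override the default 29500
}

# e.g. HuggingFaceModel(..., env=tgi_env).deploy(...)
print(tgi_env["MASTER_PORT"])
```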
Thanks. I solved the port issue, but the shards still can't start and I'm still seeing the following message in the logs:
You are using a model of type RefinedWeb to instantiate a model of type . This is not supported for all configurations of models and can yield errors.
This issue is stale because it has been open 30 days with no activity. Remove stale label or comment or this will be closed in 5 days.