Hugging Face LLM Inference Container for Amazon SageMaker

Open adityajumde opened this issue 2 years ago • 2 comments

Tried following this in my SageMaker notebook instance (g5.48xlarge). I am unable to use this script for falcon-7b; I keep getting the error: Shard cannot start. The steps work fine for falcon-40b and pythia-12b. What might be the issue? Q2: Can we follow the same steps for our fine-tuned version of falcon-7b?

adityajumde avatar Jul 20 '23 23:07 adityajumde

cc: @philschmid

Vaibhavs10 avatar Aug 03 '23 19:08 Vaibhavs10

Falcon cannot be used on multi-GPU instances since the model has an odd number of heads.

philschmid avatar Aug 03 '23 19:08 philschmid
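To illustrate the answer above: the Hugging Face LLM container (TGI) shards a model with tensor parallelism, which splits the attention heads across GPUs, so the head count must be evenly divisible by the shard count. A minimal sketch of that divisibility constraint, assuming falcon-7b has 71 attention heads and falcon-40b has 128 (both taken from their published configs, not from this thread):

```python
def valid_shard_counts(num_heads: int, max_gpus: int) -> list:
    """Return the shard counts (1..max_gpus) that evenly divide the heads.

    Tensor-parallel sharding assigns num_heads / shards heads to each GPU,
    so any shard count that does not divide num_heads causes the shard to
    fail on startup ("Shard cannot start").
    """
    return [n for n in range(1, max_gpus + 1) if num_heads % n == 0]

# Assumption: falcon-7b has 71 attention heads; 71 is prime, so on an
# 8-GPU instance like g5.48xlarge only a single shard can start.
print(valid_shard_counts(71, 8))   # [1]

# Assumption: falcon-40b has 128 heads, which shard cleanly on 2/4/8 GPUs,
# matching the report that the same steps work for falcon-40b.
print(valid_shard_counts(128, 8))  # [1, 2, 4, 8]
```

In practice this suggests forcing a single shard for falcon-7b (e.g. setting the container's `SM_NUM_GPUS` environment variable to `"1"` or deploying on a single-GPU instance). A fine-tuned falcon-7b keeps the base architecture, so the same constraint applies to Q2.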