Hugging Face LLM Inference Container for Amazon SageMaker
I tried following this in my SageMaker notebook instance (ml.g5.48xlarge), but I'm unable to use this script for falcon-7b. I keep getting the error: "Shard cannot start." The steps work fine for falcon-40b and pythia-12b. What might be the issue? Q2: Can we follow the same steps for our fine-tuned version of falcon-7b?
cc: @philschmid
Falcon-7B cannot be sharded across multiple GPUs since the model has an odd number of attention heads (71), which cannot be split evenly into tensor-parallel shards, hence the "Shard cannot start" error. Falcon-40B has a head count that divides evenly, which is why it works on the multi-GPU instance.
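A workaround is to run falcon-7b unsharded by setting `SM_NUM_GPUS` to 1. Below is a minimal sketch based on the deployment pattern from the blog post; the container version, instance type, and token limits are assumptions you may need to adjust:

```python
import json
import sagemaker
from sagemaker.huggingface import HuggingFaceModel, get_huggingface_llm_image_uri

# SageMaker session and execution role (assumes you are running inside a
# notebook instance or have credentials configured)
sess = sagemaker.Session()
role = sagemaker.get_execution_role()

# Hugging Face LLM (TGI) inference container; the version is an assumption,
# pick one available in your region
llm_image = get_huggingface_llm_image_uri("huggingface", version="0.8.2")

llm_model = HuggingFaceModel(
    role=role,
    image_uri=llm_image,
    env={
        "HF_MODEL_ID": "tiiuae/falcon-7b",  # or the Hub repo id of your fine-tuned falcon-7b
        "SM_NUM_GPUS": json.dumps(1),       # run unsharded: falcon-7b's 71 heads can't be split
        "MAX_INPUT_LENGTH": json.dumps(1024),
        "MAX_TOTAL_TOKENS": json.dumps(2048),
    },
)

# A single-GPU instance is enough for falcon-7b; ml.g5.2xlarge is an assumption
llm = llm_model.deploy(
    initial_instance_count=1,
    instance_type="ml.g5.2xlarge",
    container_startup_health_check_timeout=600,
)

print(llm.predict({"inputs": "What is Amazon SageMaker?"}))
```

For Q2: the same steps should apply to a fine-tuned falcon-7b as long as it is deployed unsharded, e.g. by pointing `HF_MODEL_ID` at your own Hub repository instead of `tiiuae/falcon-7b`.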