
Timeout loading 8B models on AWS instance

dadmobile opened this issue 7 months ago · 0 comments

For some reason the Llama 3 and Llama 3.1 models can't load into the GPU before a timeout kicks in (2 minutes). I can see that there are 4 shards to load, and each shard usually takes just over 30 seconds, so the total load time lands right around the 2-minute limit. I once got it to work, but most of the time it fails.

Can the timeout be changed? Why does it take so long? I can load the same model on my local computer in under 20 seconds.
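
For reference, here is a rough way to time the raw shard load outside of Transformer Lab. This is just a sketch assuming the model is pulled with Hugging Face `transformers`; the repo name and dtype are illustrative, and the Llama repos are gated so an access token is required:

```python
import time
import torch
from transformers import AutoModelForCausalLM

# Time how long the sharded checkpoint takes to load and move to GPU,
# to compare the AWS instance against a local machine.
start = time.time()
model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Meta-Llama-3.1-8B",  # gated repo; needs HF access approval
    torch_dtype=torch.float16,
    device_map="auto",               # place weights directly on the GPU
)
print(f"Load time: {time.time() - start:.1f}s")
```

On my local machine this finishes in well under a minute, so the slowdown seems specific to the AWS instance (possibly slow disk or network-backed storage), not the model itself.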

dadmobile · Jul 24 '24 19:07