transformerlab-app
Timeout loading 8B models on AWS instance
For some reason the Llama 3 and Llama 3.1 models can't load into the GPU before a timeout kicks in (2 minutes). I can see there are 4 shards to load, and it usually takes just over 30 s per shard, so the total load time sits right at the 120 s limit. I got it to work once, but most of the time it fails.
Can the timeout be changed? Why does loading take so long? I can load the same model on my local computer in under 20 seconds.
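To illustrate the failure mode: a hard deadline racing against a sequential shard load will fail whenever the total load time edges past the limit, which matches 4 shards at just over 30 s each against a 120 s timeout. This is a minimal hypothetical sketch (scaled down in time); it is not Transformer Lab's actual loading or timeout code.

```python
import concurrent.futures
import time


def load_shards(n_shards: int, seconds_per_shard: float) -> str:
    # Simulate loading n_shards sequentially (~30 s each in the report).
    for _ in range(n_shards):
        time.sleep(seconds_per_shard)
    return "loaded"


def load_with_timeout(timeout_s: float, n_shards: int, per_shard: float) -> str:
    # A hard deadline like the 2-minute timeout described above: if the
    # load hasn't finished by then, the caller gives up and reports failure.
    with concurrent.futures.ThreadPoolExecutor(max_workers=1) as pool:
        future = pool.submit(load_shards, n_shards, per_shard)
        try:
            return future.result(timeout=timeout_s)
        except concurrent.futures.TimeoutError:
            return "timed out"


# Scaled down: 4 "shards" at 0.05 s each against a 0.1 s deadline fails,
# mirroring 4 shards x ~30 s against a 120 s limit.
print(load_with_timeout(0.1, 4, 0.05))  # timed out
print(load_with_timeout(1.0, 4, 0.05))  # loaded
```

In this model, either raising the deadline or shaving per-shard load time (e.g. faster disk or already-cached weights, which would explain the sub-20 s local load) flips the result from "timed out" to "loaded".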