barrymac

Results: 6 comments by barrymac

I think I have a related issue. I am trying to run models on a CPU-only server with 64 threads available, and text generation is very slow and only...

OK, my own ignorance is at fault. I found that [a vicuna ggml](https://huggingface.co/eachadea/legacy-ggml-vicuna-13b-4bit) model will fully saturate all 64 CPU cores. I will try other GGML models to see how...

On the host, the volume is not created, consistent with the kubelet logs: `ls -l /var/lib/storageos/volumes/` returns `total 0`

Thanks, there's also a workaround I'm using at this time, which is to ask it to email specific files. However, I find that I need to keep the ai settings...

I have a Dell C4140 server with 4x Tesla V100 SXM2 32GB NVLink GPUs and would love to see this setup supported in future!

This works for me now with the latest version. No other pip modules needed to be installed. The model took about 1003 seconds to load the first time on my 4x...