barrymac comments

6 comments by barrymac

llama.cpp thread is not running.

I think I have a related issue. I am trying to run models on a CPU-only server with 64 threads available, and text generation is very slow and only...

llama.cpp thread is not running.

OK, my own ignorance is at fault: I found that [a vicuna ggml](https://huggingface.co/eachadea/legacy-ggml-vicuna-13b-4bit) model will fully saturate all 64 CPU cores. I will try other GGML models to see how...

dir not created on node

On the host, the volume is not created, consistent with the kubelet logs: `ls -l /var/lib/storageos/volumes/` shows `total 0`.

feature suggestion: Ability to download files from the workspace

Thanks, there's also a workaround I'm using at this time, which is to ask it to email specific files. However, I find that I need to keep the AI settings...

Support Volta architecture

I have a Dell C4140 server with 4x Tesla V100 SXM2 32GB NVLink GPUs and would love to see this setup supported in the future!

fail to load Mixtral-8x7B-v0.1-GPTQ

This works for me now with the latest version. No other pip modules needed to be installed. The model took about 1003 seconds to load the first time on my 4x...