ShadiCopty
What is the recommended minimum system spec to successfully run Llama Stack's default model in your quick start? I've been trying different configurations without much success.
It was the quantization that was causing the errors -- closing this.
Thank you Itime-ren! It would be nice to have this instead of (e.g., https://....). Also, I'm curious why I didn't need to go through this process with Ollama (download was much more...
Leaving this open for the suggestions above.
Absolutely.
Model: Llama3.1-8B-Instruct
[run.yaml.zip](https://github.com/user-attachments/files/17329799/run.yaml.zip)
Removing the fp8 quantization gets this stack to work. Let me know if you need more info regarding the system.
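For anyone else hitting this, below is a minimal sketch of what the change looks like in the inference-provider section of run.yaml. The field names (`provider_id`, `quantization`, etc.) are taken from my config and may differ across llama-stack versions, so treat them as assumptions rather than the canonical schema:

```yaml
# Sketch of the inference provider section of run.yaml
# (field names may vary across llama-stack versions).
providers:
  inference:
    - provider_id: meta-reference
      provider_type: meta-reference
      config:
        model: Llama3.1-8B-Instruct
        max_seq_len: 4096
        # This was the block causing the crash on my machine.
        # Deleting it (i.e., loading the model unquantized) made
        # the stack start successfully:
        # quantization:
        #   type: fp8
```

Note that without the quantization block, the 8B model loads at its native 16-bit precision, which needs on the order of 16 GB of GPU memory for the weights alone (8B parameters × 2 bytes) -- probably also relevant to the minimum-spec question above.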
```
/home/paperspace/anaconda3/envs/llamastack-mylocal/lib/python3.10/site-packages/torch/__init__.py:1145: UserWarning: torch.set_default_tensor_type() is deprecated as of PyTorch 2.1, please use torch.set_default_dtype() and torch.set_default_device() as alternatives. (Triggered internally at ../torch/csrc/tensor/python_tensor.cpp:432.)
  _C._set_default_tensor_type(t)
E1023 03:48:27.577000 3035 anaconda3/envs/llamastack-mylocal/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:732] failed (exitcode: -9) local_rank: 0...
```

(`exitcode: -9` means the worker was killed with SIGKILL, which on Linux usually points to the out-of-memory killer.)
@ashwinb Still failing. I removed the old installation entirely to be sure, and I am using the meta-reference-quantized implementation with fp8.