llama-stack
Can't run llama after installation
System Info
M4 Pro MacBook Pro 16" running macOS 15.2.
Information
- [X] The official example scripts
- [ ] My own modified scripts
🐛 Describe the bug
I simply can't run llama. That's all. I installed it via the instructions found here https://www.llama.com/llama-downloads/ , but nothing happens when entering "llama" or "llama-stack" into Terminal; the shell just says the command doesn't exist.
Error logs
Here's the entire history of the Terminal window, from install to trying to run it:
Last login: Wed Dec 11 19:52:53 on console
jaden@PureSlate ~ % pip install llama-stack
Defaulting to user installation because normal site-packages is not writeable
Collecting llama-stack
  Downloading llama_stack-0.0.61-py3-none-any.whl (453 kB) 453.3/453.3 kB 5.2 MB/s eta 0:00:00
Collecting blobfile
  Downloading blobfile-3.0.0-py3-none-any.whl (75 kB) 75.4/75.4 kB 8.4 MB/s eta 0:00:00
Collecting fire
  Downloading fire-0.7.0.tar.gz (87 kB) 87.2/87.2 kB 9.1 MB/s eta 0:00:00
  Preparing metadata (setup.py) ... done
Collecting httpx
  Downloading httpx-0.28.1-py3-none-any.whl (73 kB) 73.5/73.5 kB 7.5 MB/s eta 0:00:00
Collecting huggingface-hub
  Downloading huggingface_hub-0.26.5-py3-none-any.whl (447 kB) 447.8/447.8 kB 32.7 MB/s eta 0:00:00
Collecting llama-models>=0.0.61
  Downloading llama_models-0.0.61-py3-none-any.whl (1.6 MB) 1.6/1.6 MB 27.9 MB/s eta 0:00:00
Collecting llama-stack-client>=0.0.61
  Downloading llama_stack_client-0.0.61-py3-none-any.whl (291 kB) 291.1/291.1 kB 22.3 MB/s eta 0:00:00
Requirement already satisfied: prompt-toolkit in /Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages (from llama-stack) (3.0.38)
Collecting python-dotenv
  Downloading python_dotenv-1.0.1-py3-none-any.whl (19 kB)
Collecting pydantic>=2
  Downloading pydantic-2.10.3-py3-none-any.whl (456 kB) 457.0/457.0 kB 25.5 MB/s eta 0:00:00
Requirement already satisfied: requests in /Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages (from llama-stack) (2.28.2)
Collecting rich
  Downloading rich-13.9.4-py3-none-any.whl (242 kB) 242.4/242.4 kB 14.6 MB/s eta 0:00:00
Requirement already satisfied: setuptools in /Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages (from llama-stack) (65.5.0)
Collecting termcolor
  Downloading termcolor-2.5.0-py3-none-any.whl (7.8 kB)
Collecting PyYAML
  Downloading PyYAML-6.0.2-cp311-cp311-macosx_11_0_arm64.whl (172 kB) 172.0/172.0 kB 22.9 MB/s eta 0:00:00
Collecting jinja2
  Downloading jinja2-3.1.4-py3-none-any.whl (133 kB) 133.3/133.3 kB 15.3 MB/s eta 0:00:00
Collecting tiktoken
  Downloading tiktoken-0.8.0-cp311-cp311-macosx_11_0_arm64.whl (982 kB) 982.4/982.4 kB 32.8 MB/s eta 0:00:00
Collecting Pillow
  Downloading pillow-11.0.0-cp311-cp311-macosx_11_0_arm64.whl (3.0 MB) 3.0/3.0 MB 40.1 MB/s eta 0:00:00
Collecting anyio<5,>=3.5.0
  Downloading anyio-4.7.0-py3-none-any.whl (93 kB) 93.1/93.1 kB 14.7 MB/s eta 0:00:00
Collecting click
  Downloading click-8.1.7-py3-none-any.whl (97 kB) 97.9/97.9 kB 13.2 MB/s eta 0:00:00
Collecting distro<2,>=1.7.0
  Downloading distro-1.9.0-py3-none-any.whl (20 kB)
Collecting pandas
  Downloading pandas-2.2.3-cp311-cp311-macosx_11_0_arm64.whl (11.3 MB) 11.3/11.3 MB 40.5 MB/s eta 0:00:00
Collecting pyaml
  Downloading pyaml-24.12.1-py3-none-any.whl (25 kB)
Collecting sniffio
  Downloading sniffio-1.3.1-py3-none-any.whl (10 kB)
Requirement already satisfied: tqdm in /Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages (from llama-stack-client>=0.0.61->llama-stack) (4.65.0)
Collecting typing-extensions<5,>=4.7
  Downloading typing_extensions-4.12.2-py3-none-any.whl (37 kB)
Requirement already satisfied: certifi in /Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages (from httpx->llama-stack) (2022.12.7)
Collecting httpcore==1.*
  Downloading httpcore-1.0.7-py3-none-any.whl (78 kB) 78.6/78.6 kB 4.2 MB/s eta 0:00:00
Requirement already satisfied: idna in /Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages (from httpx->llama-stack) (3.4)
Collecting h11<0.15,>=0.13
  Downloading h11-0.14.0-py3-none-any.whl (58 kB) 58.3/58.3 kB 8.3 MB/s eta 0:00:00
Collecting annotated-types>=0.6.0
  Downloading annotated_types-0.7.0-py3-none-any.whl (13 kB)
Collecting pydantic-core==2.27.1
  Downloading pydantic_core-2.27.1-cp311-cp311-macosx_11_0_arm64.whl (1.8 MB) 1.8/1.8 MB 35.7 MB/s eta 0:00:00
Collecting pycryptodomex>=3.8
  Downloading pycryptodomex-3.21.0-cp36-abi3-macosx_10_9_universal2.whl (2.5 MB) 2.5/2.5 MB 47.7 MB/s eta 0:00:00
Requirement already satisfied: urllib3<3,>=1.25.3 in /Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages (from blobfile->llama-stack) (1.26.15)
Collecting lxml>=4.9
  Downloading lxml-5.3.0-cp311-cp311-macosx_10_9_universal2.whl (8.1 MB) 8.1/8.1 MB 46.1 MB/s eta 0:00:00
Collecting filelock>=3.0
  Downloading filelock-3.16.1-py3-none-any.whl (16 kB)
Collecting fsspec>=2023.5.0
  Downloading fsspec-2024.10.0-py3-none-any.whl (179 kB) 179.6/179.6 kB 24.2 MB/s eta 0:00:00
Collecting packaging>=20.9
  Downloading packaging-24.2-py3-none-any.whl (65 kB) 65.5/65.5 kB 10.0 MB/s eta 0:00:00
Requirement already satisfied: wcwidth in /Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages (from prompt-toolkit->llama-stack) (0.2.6)
Requirement already satisfied: charset-normalizer<4,>=2 in /Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages (from requests->llama-stack) (3.1.0)
Collecting markdown-it-py>=2.2.0
  Downloading markdown_it_py-3.0.0-py3-none-any.whl (87 kB) 87.5/87.5 kB 11.2 MB/s eta 0:00:00
Collecting pygments<3.0.0,>=2.13.0
  Downloading pygments-2.18.0-py3-none-any.whl (1.2 MB) 1.2/1.2 MB 34.2 MB/s eta 0:00:00
Collecting mdurl~=0.1
  Downloading mdurl-0.1.2-py3-none-any.whl (10.0 kB)
Collecting MarkupSafe>=2.0
  Downloading MarkupSafe-3.0.2-cp311-cp311-macosx_11_0_arm64.whl (12 kB)
Collecting numpy>=1.23.2
  Downloading numpy-2.2.0-cp311-cp311-macosx_14_0_arm64.whl (5.4 MB) 5.4/5.4 MB 46.6 MB/s eta 0:00:00
Collecting python-dateutil>=2.8.2
  Downloading python_dateutil-2.9.0.post0-py2.py3-none-any.whl (229 kB) 229.9/229.9 kB 23.3 MB/s eta 0:00:00
Collecting pytz>=2020.1
  Downloading pytz-2024.2-py2.py3-none-any.whl (508 kB) 508.0/508.0 kB 24.2 MB/s eta 0:00:00
Collecting tzdata>=2022.7
  Downloading tzdata-2024.2-py2.py3-none-any.whl (346 kB) 346.6/346.6 kB 20.1 MB/s eta 0:00:00
Collecting regex>=2022.1.18
  Downloading regex-2024.11.6-cp311-cp311-macosx_11_0_arm64.whl (284 kB) 284.6/284.6 kB 14.4 MB/s eta 0:00:00
Collecting six>=1.5
  Downloading six-1.17.0-py2.py3-none-any.whl (11 kB)
Installing collected packages: pytz, tzdata, typing-extensions, termcolor, sniffio, six, regex, PyYAML, python-dotenv, pygments, pycryptodomex, Pillow, packaging, numpy, mdurl, MarkupSafe, lxml, h11, fsspec, filelock, distro, click, annotated-types, tiktoken, python-dateutil, pydantic-core, pyaml, markdown-it-py, jinja2, huggingface-hub, httpcore, fire, blobfile, anyio, rich, pydantic, pandas, httpx, llama-stack-client, llama-models, llama-stack
WARNING: The script dotenv is installed in '/Users/jaden/Library/Python/3.11/bin' which is not on PATH.
Consider adding this directory to PATH or, if you prefer to suppress this warning, use --no-warn-script-location.
WARNING: The script pygmentize is installed in '/Users/jaden/Library/Python/3.11/bin' which is not on PATH.
Consider adding this directory to PATH or, if you prefer to suppress this warning, use --no-warn-script-location.
WARNING: The scripts f2py and numpy-config are installed in '/Users/jaden/Library/Python/3.11/bin' which is not on PATH.
Consider adding this directory to PATH or, if you prefer to suppress this warning, use --no-warn-script-location.
WARNING: The script distro is installed in '/Users/jaden/Library/Python/3.11/bin' which is not on PATH.
Consider adding this directory to PATH or, if you prefer to suppress this warning, use --no-warn-script-location.
WARNING: The script pyaml is installed in '/Users/jaden/Library/Python/3.11/bin' which is not on PATH.
Consider adding this directory to PATH or, if you prefer to suppress this warning, use --no-warn-script-location.
WARNING: The script markdown-it is installed in '/Users/jaden/Library/Python/3.11/bin' which is not on PATH.
Consider adding this directory to PATH or, if you prefer to suppress this warning, use --no-warn-script-location.
WARNING: The script huggingface-cli is installed in '/Users/jaden/Library/Python/3.11/bin' which is not on PATH.
Consider adding this directory to PATH or, if you prefer to suppress this warning, use --no-warn-script-location.
DEPRECATION: fire is being installed using the legacy 'setup.py install' method, because it does not have a 'pyproject.toml' and the 'wheel' package is not installed. pip 23.1 will enforce this behaviour change. A possible replacement is to enable the '--use-pep517' option. Discussion can be found at https://github.com/pypa/pip/issues/8559
Running setup.py install for fire ... done
WARNING: The script httpx is installed in '/Users/jaden/Library/Python/3.11/bin' which is not on PATH.
Consider adding this directory to PATH or, if you prefer to suppress this warning, use --no-warn-script-location.
WARNING: The script llama-stack-client is installed in '/Users/jaden/Library/Python/3.11/bin' which is not on PATH.
Consider adding this directory to PATH or, if you prefer to suppress this warning, use --no-warn-script-location.
WARNING: The scripts example_chat_completion, example_text_completion, multimodal_example_chat_completion and multimodal_example_text_completion are installed in '/Users/jaden/Library/Python/3.11/bin' which is not on PATH.
Consider adding this directory to PATH or, if you prefer to suppress this warning, use --no-warn-script-location.
WARNING: The scripts install-wheel-from-presigned and llama are installed in '/Users/jaden/Library/Python/3.11/bin' which is not on PATH.
Consider adding this directory to PATH or, if you prefer to suppress this warning, use --no-warn-script-location.
Successfully installed MarkupSafe-3.0.2 Pillow-11.0.0 PyYAML-6.0.2 annotated-types-0.7.0 anyio-4.7.0 blobfile-3.0.0 click-8.1.7 distro-1.9.0 filelock-3.16.1 fire-0.7.0 fsspec-2024.10.0 h11-0.14.0 httpcore-1.0.7 httpx-0.28.1 huggingface-hub-0.26.5 jinja2-3.1.4 llama-models-0.0.61 llama-stack-0.0.61 llama-stack-client-0.0.61 lxml-5.3.0 markdown-it-py-3.0.0 mdurl-0.1.2 numpy-2.2.0 packaging-24.2 pandas-2.2.3 pyaml-24.12.1 pycryptodomex-3.21.0 pydantic-2.10.3 pydantic-core-2.27.1 pygments-2.18.0 python-dateutil-2.9.0.post0 python-dotenv-1.0.1 pytz-2024.2 regex-2024.11.6 rich-13.9.4 six-1.17.0 sniffio-1.3.1 termcolor-2.5.0 tiktoken-0.8.0 typing-extensions-4.12.2 tzdata-2024.2
[notice] A new release of pip is available: 23.0.1 -> 24.3.1
[notice] To update, run: pip install --upgrade pip
jaden@PureSlate ~ % pip install --upgrade pip
Defaulting to user installation because normal site-packages is not writeable
Requirement already satisfied: pip in /Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages (23.0.1)
Collecting pip
Downloading pip-24.3.1-py3-none-any.whl (1.8 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 1.8/1.8 MB 10.7 MB/s eta 0:00:00
Installing collected packages: pip
WARNING: The scripts pip, pip3 and pip3.11 are installed in '/Users/jaden/Library/Python/3.11/bin' which is not on PATH.
Consider adding this directory to PATH or, if you prefer to suppress this warning, use --no-warn-script-location.
Successfully installed pip-24.3.1
jaden@PureSlate ~ % llama model list
zsh: command not found: llama
jaden@PureSlate ~ % pip install llama
Defaulting to user installation because normal site-packages is not writeable
Collecting llama
Downloading llama-0.1.1.tar.gz (387 kB)
Installing build dependencies ... done
Getting requirements to build wheel ... error
error: subprocess-exited-with-error
× Getting requirements to build wheel did not run successfully.
│ exit code: 1
╰─> [20 lines of output]
Traceback (most recent call last):
File "/Users/jaden/Library/Python/3.11/lib/python/site-packages/pip/_vendor/pyproject_hooks/_in_process/_in_process.py", line 353, in
note: This error originates from a subprocess, and is likely not a problem with pip.
error: subprocess-exited-with-error
× Getting requirements to build wheel did not run successfully.
│ exit code: 1
╰─> See above for output.
note: This error originates from a subprocess, and is likely not a problem with pip.
jaden@PureSlate ~ % llama model list
zsh: command not found: llama
jaden@PureSlate ~ % llama-stack
zsh: command not found: llama-stack
jaden@PureSlate ~ % bash
bash-5.2$ llama model list
bash: llama: command not found
bash-5.2$
Expected behavior
I expected to be able to use llama.
Your first install actually succeeded (note the "Successfully installed ... llama-stack-0.0.61" line), but the warnings in that same log are the real problem: the `llama` and `llama-stack-client` scripts were installed to '/Users/jaden/Library/Python/3.11/bin', which is not on your PATH, so zsh can't find them. The later `pip install llama` attempt is an unrelated PyPI package and isn't needed.
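As a minimal sketch of the PATH fix (the bin directory is taken from the warning lines in your log; double-check it matches what pip printed for you):

```shell
# Directory where pip placed the console scripts, from the
# "WARNING: The scripts ... are installed in ..." lines above
BIN_DIR="$HOME/Library/Python/3.11/bin"

# Make the scripts visible to the current shell session
export PATH="$BIN_DIR:$PATH"

# To persist across sessions, append the same export to ~/.zshrc, e.g.:
#   echo 'export PATH="$HOME/Library/Python/3.11/bin:$PATH"' >> ~/.zshrc

# Verify the shell can now resolve the entry point
command -v llama || echo "llama still not found; double-check the directory above"
```

Open a new Terminal window (or `source ~/.zshrc`) after editing the file so the change takes effect.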
Did you follow the macOS-specific instructions for the Llama Stack distribution that you want to use?
To clarify, Llama Stack is a system that sits as a layer above an inference server such as Ollama, vLLM, TGI, etc.
There is a meta-reference distribution that comes with a server if you don't already run one.
You have to follow the instructions for the setup you want to create using the correct distribution and the settings specific to your hardware and software (e.g. with or without GPU, local or remote inference server, etc.)
I strongly recommend that you read the docs here: https://llama-stack.readthedocs.io/en/latest/distributions/index.html
If you are after testing stuff locally on your machine, then the Ollama distribution is probably best for you. Then it's a choice of running the Docker version of Llama Stack or building locally under Conda or a Python venv.
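A minimal sketch of the venv route (the environment path is illustrative, and the final command assumes the CLI entry point installed cleanly): activating a venv prepends its bin/ directory to PATH, which sidesteps the user-site PATH warnings from the log above entirely.

```shell
# Create and activate an isolated environment; activation puts the
# venv's bin/ directory at the front of PATH automatically
python3 -m venv ~/llama-stack-venv
source ~/llama-stack-venv/bin/activate

# Console scripts now land inside the venv, already on PATH
pip install llama-stack

# The entry point resolves in this shell without any PATH editing
llama --help
```

The same isolation also keeps llama-stack's dependencies from colliding with the framework Python install under /Library/Frameworks.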
I would add that this is a very technical system; it is being continuously developed and changed, and there are lots of gotchas, so you will probably spend some time checking the Issues tickets here. There are much easier ways of running Llama models locally (e.g. Ollama) if that is all you're interested in. However, if you're developing scalable AI applications and need to be able to abstract away implementation details, then this is the right place for you.
Understood.