llama-stack

Composable building blocks to build Llama Apps

Results 360 llama-stack issues

Hi, I have installed the llama CLI with `pip3 install llama-stack`, but I am getting the error below for every command. This prevents any further use of llama-stack; could you please guide...

Many developers will be surprised to learn that `requests` library calls do not include timeouts by default. This means that an attempted request could hang indefinitely if no connection is...

CLA Signed
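The issue above is straightforward to guard against: pass an explicit `timeout` to every `requests` call. A minimal sketch, assuming a helper named `fetch` (the name is illustrative, not part of llama-stack):

```python
import requests

# Always pass an explicit timeout; without one, requests can block
# indefinitely if the server accepts the connection but never responds.
def fetch(url: str, timeout=(3.05, 10)) -> int:
    """Return the HTTP status code, or -1 on timeout/connection failure.

    `timeout` is (connect, read) in seconds, or a single float for both.
    """
    try:
        return requests.get(url, timeout=timeout).status_code
    except requests.exceptions.RequestException:
        return -1
```

The `(3.05, 10)` default bounds connection setup and each read separately, so a stalled server fails fast instead of hanging the caller.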

If I run a build for Ollama with Docker, then after configuring and running the Docker image, the image still looks for GPU support and fails. Steps to reproduce: llama stack...

In the previous design, the server endpoint at the top-most level extracted the headers from the request and set provider data (e.g., private keys) that the implementations could retrieve using...

CLA Signed
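The design described above can be sketched with a context variable: the server extracts a header once per request and stashes it where provider implementations can read it later. The header and function names below are illustrative assumptions, not the project's actual API:

```python
import contextvars

# Holds per-request provider data (e.g., private keys) extracted from headers.
_provider_data = contextvars.ContextVar("provider_data", default=None)

def handle_request(headers: dict):
    # Top-level endpoint: pull provider data out of the request headers,
    # set it for the duration of the request, then restore the old value.
    token = _provider_data.set(headers.get("X-LlamaStack-Provider-Data"))
    try:
        return provider_impl()
    finally:
        _provider_data.reset(token)

def provider_impl():
    # Any implementation deep in the call stack can retrieve the data
    # without it being threaded through every function signature.
    return _provider_data.get()
```

Using `contextvars` rather than a module-level global keeps concurrent requests isolated from each other in async servers.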

The current implementation of `local` means no sharding/tensor parallelism, etc., and it refuses to work on my dual-4090 setup. How do I enable multi-GPU, or how do I enable...

Support for the Bedrock inference provider was added in the following commit: https://github.com/meta-llama/llama-stack/commit/95abbf576b4b078e72b779f534cbaf696e30ecab However, it was overwritten by the next merge: https://github.com/meta-llama/llama-stack/commit/56aed59eb4c9915676c6fc7aac009dad97e7ead2 As a result, Bedrock is not displayed as...

CLA Signed

In this file the image is not shown: https://github.com/meta-llama/llama-stack/blob/main/docs/cli_reference.md ![Screenshot_20240930_133446_Chrome](https://github.com/user-attachments/assets/c2f82d55-ea67-41e6-a3b5-3fd8857c1c46)

**Why this PR** We want to add [Runpod](https://www.runpod.io/) as a remote inference provider for Llama-stack. Runpod endpoints are OpenAI-compatible, so it is recommended to use them with Runpod model-serving endpoints....

CLA Signed
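Because the endpoints are OpenAI-compatible, the request shape such a provider relies on can be sketched with the standard library alone. The URL path, header layout, and model id below follow the generic OpenAI chat-completions wire format and are assumptions, not confirmed RunPod or llama-stack values:

```python
import json
import urllib.request

def build_chat_request(base_url: str, api_key: str, model: str, prompt: str):
    """Build an OpenAI-compatible POST /chat/completions request.

    `base_url` should point at the provider's /v1 root (placeholder here).
    """
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        f"{base_url}/chat/completions",
        data=json.dumps(payload).encode(),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

Since the wire format matches OpenAI's, the same request works against any compatible serving endpoint by swapping `base_url` and the bearer token.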

Can someone help me understand how the context is being tracked for the agent turn-create API, or for the inference chat-completion API? I want to understand how it's...

Running the client fails with:
```
$ python -m llama_stack.apis.inference.client localhost 11434
User> hello world, write me a 2 sentence poem about the moon
Error: HTTP 404 404 page not found
```
...