llama-stack

Composable building blocks to build Llama Apps

Results: 360 llama-stack issues, sorted by recently updated

### Why this PR
We want to set up Weaviate as a remote vector DB provider for llama-stack.

### What is in the PR
- Add the Weaviate memory adapter to...
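For context, reaching a remote Weaviate instance from Python looks roughly like the sketch below; the cluster URL and API key are placeholders, and the adapter's actual configuration in the PR may differ.

```python
# A minimal sketch, assuming a remote Weaviate instance secured with an API
# key; the URL and key below are placeholders, not values from the PR.
import weaviate

client = weaviate.Client(
    url="https://<your-cluster>.weaviate.network",  # remote Weaviate endpoint
    auth_client_secret=weaviate.AuthApiKey(api_key="<WEAVIATE_API_KEY>"),
)

print(client.is_ready())  # True once the remote instance is reachable
```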

CLA Signed

For testing, brought up a local instance of the llama stack and ran a few safety queries with input prompts to check. Verified that the output looks as expected.

CLA Signed

We want to set up Databricks as a remote inference provider for llama-stack. [Databricks Foundation Model APIs](https://docs.databricks.com/en/machine-learning/foundation-models/index.html) are OpenAI compatible, and we suggest using the [OpenAI client](https://docs.databricks.com/en/machine-learning/model-serving/score-foundation-models.html) to query Databricks model...
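Since the endpoints are OpenAI-compatible, a query can be sketched with the standard OpenAI client as below; the workspace URL, token, and endpoint name are placeholders rather than values from the PR.

```python
# A minimal sketch, assuming an OpenAI-compatible Databricks serving
# endpoint; the workspace host, token, and endpoint name are placeholders.
from openai import OpenAI

client = OpenAI(
    api_key="<DATABRICKS_TOKEN>",  # Databricks personal access token
    base_url="https://<workspace-host>/serving-endpoints",
)

response = client.chat.completions.create(
    model="databricks-meta-llama-3-1-70b-instruct",  # example endpoint name
    messages=[{"role": "user", "content": "Hello from llama-stack!"}],
)
print(response.choices[0].message.content)
```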

CLA Signed

After launching the distribution server with `llama distribution start --name local-llama-8b --port 5000 --disable-ipv6`, running any inference example, e.g. `python examples/scripts/vacation.py localhost 5000 --disable-safety`, gives the following...

Trying to run inference with FP8 quantization, and got the following error:

```
Configuring API surface: inference
Enter value for model (existing: Meta-Llama3.1-8B-Instruct) (required): Meta-Llama3.1-8B-Instruct
Enter value for quantization (optional): ...
```

Trying to run inference with the FP8 version of the Llama 3.1 405B model (Meta-Llama3.1-405B-Instruct). The model was downloaded with `llama download --source huggingface --model-id Meta-Llama3.1-405B-Instruct --hf-token TOKEN`. However, the command `llama...

Describe the bug: The model IDs for several of the 405B models include colons, making them incompatible with Windows systems. Example: `Meta-Llama3.1-405B-Instruct:bf16-mp8` raises OSError: [WinError 123] The filename, directory name, or...
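A hypothetical workaround (not from the issue) is to strip the characters Windows forbids in file names before using a model ID as a path component:

```python
# Hypothetical helper, not part of llama-stack: replace characters that are
# invalid in Windows file names (':' among them) with a safe separator.
import re

def sanitize_model_id(model_id: str) -> str:
    return re.sub(r'[:<>"/\\|?*]', "-", model_id)

print(sanitize_model_id("Meta-Llama3.1-405B-Instruct:bf16-mp8"))
# -> Meta-Llama3.1-405B-Instruct-bf16-mp8
```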

Hi dear team! I love your work. I wanted to ask: how should one report security bugs / vulnerabilities? I would like to report a security vulnerability that I...

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, AwqConfig

model_id = "hugging-quants/Meta-Llama-3.1-70B-Instruct-AWQ-INT4"
llm = HuggingFaceLLM(
    context_window=8192,  # 4096
    max_new_tokens=512,
    generate_kwargs={"temperature": 0, "do_sample": False},
    system_prompt=system_prompt,
    query_wrapper_prompt=query_wrapper_prompt,
    tokenizer_name=model_id,
    model_name=model_id,
    device_map="auto",
    tokenizer_kwargs={"max_length": 8192},  # ...
```