llama-stack issues

Run ollama gpu distribution failed

2

### System Info NVIDIA GPU A30 nvidia-smi ``` Thu Oct 31 11:43:51 2024 +-----------------------------------------------------------------------------------------+ | NVIDIA-SMI 555.42.02 Driver Version: 555.42.02 CUDA Version: 12.5 | |-----------------------------------------+------------------------+----------------------+ | GPU Name Persistence-M |...

alexhegit

Significantly simpler and malleable test setup

2

# What does this PR do? Significantly simplifies running tests. Previously you ran tests by doing: ```bash MODEL_ID= PROVIDER_ID= PROVIDER_CONFIG=config.yaml pytest -s llama_stack/providers/tests/inference/test_inference.py ``` This was pretty annoying because -...

ashwinb

CLA Signed

[docs] update documentations

# What does this PR do? - Update documentation ## Feature/Issue validation/testing/test plan - https://llama-stack.readthedocs.io/en/latest/ ## Sources Please link relevant resources if necessary. ## Before submitting - [ ] This...

yanxi0830

CLA Signed

fix bedrock impl

This is needed after the async refactor. testplan: * llama stack build --template bedrock --image-type conda * llama stack run bedrock --port 5000 * python examples/inference/client.py localhost 5000 ``` User>hello...

dineshyv

CLA Signed

add bedrock distribution code

testing steps using conda: * llama stack build --template bedrock --image-type conda * llama stack configure bedrock * llama stack run bedrock --port 5000 * verified models are registered by...

dineshyv

CLA Signed

LLamaGuard, routing, and vllm

6

### System Info Cuda 12.6, torch 2.5.1, nvidia gpu ### Information - [X] The official example scripts - [ ] My own modified scripts ### 🐛 Describe the bug I'm...

stevegrubb

persist registered objects with distribution

This is a WIP to get quick feedback. The main things I need feedback on: 1) Accessing the registry as a global variable. The other option is to pass along either the dist...

dineshyv

CLA Signed

add NVIDIA NIM inference adapter

# What does this PR do? This PR adds a basic inference adapter for NVIDIA NIMs. What it does not do: - support streaming - have test coverage for...

mattf

CLA Signed

[Evals API][9/n] SimpleQA evals

--continuation of https://github.com/meta-llama/llama-stack/pull/352 # TL;DR 1. Implement [OpenAI's SimpleQA's Benchmark](https://openai.com/index/introducing-simpleqa/) as ScoringFn ([reference](https://github.com/openai/simple-evals/blob/main/simpleqa_eval.py)) [RFC] - Option 1: SimpleQAScoringFn: Move each benchmark eval into a separate scoring function with its own context....

yanxi0830

CLA Signed

[Evals API][8/n] AnswerParsingScoringFn for MMLU

--continuation of https://github.com/meta-llama/llama-stack/pull/333 # TL;DR 1. Introduce a registerable AnswerParsingScoringFn with AnswerParsingScoringContext for registering scoring functions with context 2. Remove parameters field (context alone is sufficient for things related to...

yanxi0830

CLA Signed
