Ashwin Bharambe

Results 14 issues of Ashwin Bharambe

We should use the Inference APIs to execute Llama Guard instead of calling HuggingFace APIs directly, so that the actual inference is handled by the Inference API.

CLA Signed

In the previous design, the server endpoint at the top-most level extracted the headers from the request and set provider data (e.g., private keys) that the implementations could retrieve using...

CLA Signed

Most of the current inference providers only implement the `chat_completion()` method. The `completion()` method raises a `NotImplementedError`. We should implement this method for all the inference providers: - meta-reference -...

good first issue
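
The gap described in this issue can be illustrated with a small sketch. Everything in it is hypothetical: `InferenceProvider`, `ChatOnlyProvider`, `FullProvider`, and their signatures are invented for illustration and are not the Llama Stack interfaces. It only shows the shape of the problem: a provider that implements `chat_completion()` but inherits a `completion()` that raises `NotImplementedError`, next to one that overrides it.

```python
class InferenceProvider:
    """Hypothetical base class; not the real Llama Stack interface."""

    def chat_completion(self, messages: list) -> str:
        raise NotImplementedError

    def completion(self, prompt: str) -> str:
        # Mirrors the situation in the issue: raw completion is unsupported by default.
        raise NotImplementedError("completion() is not implemented for this provider")


class ChatOnlyProvider(InferenceProvider):
    """Implements only the chat path, like most providers described in the issue."""

    def chat_completion(self, messages: list) -> str:
        # Toy stand-in for a model call: echo the last message.
        return f"echo: {messages[-1]['content']}"


class FullProvider(ChatOnlyProvider):
    """Adds the missing raw-completion path."""

    def completion(self, prompt: str) -> str:
        # Toy stand-in for a model call: append a fixed continuation.
        return prompt + " ..."
```

A caller that needs raw completion can then rely on `FullProvider().completion(...)`, while `ChatOnlyProvider().completion(...)` still raises.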

This PR makes several core changes to the developer experience surrounding Llama Stack. **Background:** PR https://github.com/meta-llama/llama-stack/pull/92 introduced the notion of "routing" to the Llama Stack. It introduces three object types:...

CLA Signed

Added support for structured output in the API, along with a reference implementation for meta-reference. A few notes: - Two formats are specified in the API: JSON schema and EBNF...

CLA Signed
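
To make the JSON schema format concrete, here is a minimal sketch of what such a schema constrains. It is not the PR's implementation and does not perform guided decoding; it is a hand-rolled, post-hoc check covering only required keys and primitive types. The schema contents and the `conforms` helper are assumptions made for illustration.

```python
import json

# Hypothetical schema a caller might supply as the response format.
SCHEMA = {
    "type": "object",
    "properties": {"name": {"type": "string"}, "year": {"type": "integer"}},
    "required": ["name", "year"],
}

# Mapping from JSON schema primitive type names to Python types.
_PY_TYPES = {"string": str, "integer": int, "number": (int, float), "boolean": bool}


def _matches(value, type_name: str) -> bool:
    # bool is a subclass of int in Python, so reject it for numeric types.
    if type_name in ("integer", "number") and isinstance(value, bool):
        return False
    return isinstance(value, _PY_TYPES[type_name])


def conforms(raw: str, schema: dict) -> bool:
    """Return True if `raw` parses as a JSON object satisfying the schema subset."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        return False
    if not isinstance(data, dict):
        return False
    if any(key not in data for key in schema.get("required", [])):
        return False
    for key, spec in schema.get("properties", {}).items():
        if key in data and not _matches(data[key], spec["type"]):
            return False
    return True
```

For example, `conforms('{"name": "Llama", "year": 2024}', SCHEMA)` accepts the output, while a response missing `year` or giving it as a string is rejected. A full validator or grammar-constrained decoder would cover far more of the specification than this subset.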

# What does this PR do?

Significantly simplifies running tests. Previously you ran tests by doing:

```bash
MODEL_ID= PROVIDER_ID= PROVIDER_CONFIG=config.yaml pytest -s llama_stack/providers/tests/inference/test_inference.py
```

This was pretty annoying because -...

CLA Signed

### System Info

...

### Information

- [ ] The official example scripts
- [ ] My own modified scripts

### 🐛 Describe the bug

vLLM does not work when...

### 🚀 The feature, motivation and pitch

Several providers (fireworks, together, meta-reference) support guided decoding with inference (for example, specifying a JSON schema as a "grammar" for decoding). vLLM supports this functionality --...

good first issue

### 🚀 The feature, motivation and pitch

We have a decently flexible testing system for testing various combinations of providers when composing a Llama Stack. See https://github.com/meta-llama/llama-stack/blob/main/llama_stack/providers/tests/README.md We need to...

good first issue