
Adds a Groq inference adapter.

swanhtet1992 opened this pull request 11 months ago • 0 comments

What does this PR do?

This PR adds a Groq inference adapter.

Key features implemented:

  • Chat completion API with streaming support
  • Distribution template for easy deployment

What it does not support:

  • Text completion API
  • Embeddings API
  • Certain OpenAI features:
    • logprobs and top_logprobs
    • response_format options
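
As a hedged illustration of the limitations above (the helper and field names here are hypothetical, not part of this PR), a caller could strip the OpenAI-style request fields the adapter does not support before issuing a chat completion request:

```python
# Hypothetical sketch: drop request fields the Groq adapter does not
# support (logprobs, top_logprobs, response_format) before sending.
UNSUPPORTED_PARAMS = {"logprobs", "top_logprobs", "response_format"}

def strip_unsupported(params: dict) -> dict:
    """Return a copy of the request with unsupported fields removed."""
    return {k: v for k, v in params.items() if k not in UNSUPPORTED_PARAMS}

request = {
    "model": "llama3-8b-8192",  # example Groq model id
    "messages": [{"role": "user", "content": "Hello"}],
    "stream": True,                               # streaming is supported
    "logprobs": True,                             # not supported
    "response_format": {"type": "json_object"},   # not supported
}

cleaned = strip_unsupported(request)
print(sorted(cleaned))
```

This keeps the supported fields (`model`, `messages`, `stream`) intact while silently discarding the ones the adapter would reject.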

Test Plan

Run the following test command:

pytest -s -v --providers inference=groq llama_stack/providers/tests/inference/ --env GROQ_API_KEY=<your-api-key>

To test the distribution template:

# Docker
LLAMA_STACK_PORT=5001
docker run -it -p $LLAMA_STACK_PORT:$LLAMA_STACK_PORT \
  llamastack/distribution-groq \
  --port $LLAMA_STACK_PORT \
  --env GROQ_API_KEY=$GROQ_API_KEY

# Conda
llama stack build --template groq --image-type conda
llama stack run ./run.yaml \
  --port $LLAMA_STACK_PORT \
  --env GROQ_API_KEY=$GROQ_API_KEY

Sources

Before submitting

  • [ ] This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
  • [ ] Ran pre-commit to handle lint / formatting issues.
  • [x] Read the contributor guideline, Pull Request section
  • [ ] Updated relevant documentation.
  • [x] Wrote necessary unit or integration tests.

swanhtet1992 · Nov 24, 2024