llama-stack
                                
Adds Groq inference adapter.
What does this PR do?
This PR adds a Groq inference adapter.
Key features implemented:
- Chat completion API with streaming support
- Distribution template for easy deployment
What it does not support:
- Text completion API
- Embeddings API
- Certain OpenAI features:
  - logprobs and top_logprobs
  - response_format options
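For context on what the adapter is talking to: Groq exposes an OpenAI-compatible chat completions endpoint, so the request shape is the familiar OpenAI one. The sketch below is illustrative only (the helper name and model id are hypothetical, not the adapter's actual code) and shows roughly the request such an adapter builds:

```python
import json
import os
import urllib.request

# Groq's public OpenAI-compatible chat completions endpoint.
GROQ_URL = "https://api.groq.com/openai/v1/chat/completions"

def build_chat_request(model, messages, stream=False):
    """Build the HTTP request for an OpenAI-style chat completion call.

    Hypothetical helper for illustration; the real adapter lives in
    llama_stack/providers and uses its own client code.
    """
    body = json.dumps({"model": model, "messages": messages, "stream": stream}).encode()
    return urllib.request.Request(
        GROQ_URL,
        data=body,
        headers={
            "Authorization": f"Bearer {os.environ.get('GROQ_API_KEY', '')}",
            "Content-Type": "application/json",
        },
        method="POST",
    )

req = build_chat_request("llama3-8b-8192", [{"role": "user", "content": "Hello"}])
print(req.full_url)  # → https://api.groq.com/openai/v1/chat/completions
```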
 
Test Plan
Run the following test command:
pytest -s -v --providers inference=groq llama_stack/providers/tests/inference/ --env GROQ_API_KEY=<your-api-key>
To test the distribution template:
# Docker
LLAMA_STACK_PORT=5001
docker run -it -p $LLAMA_STACK_PORT:$LLAMA_STACK_PORT \
  llamastack/distribution-groq \
  --port $LLAMA_STACK_PORT \
  --env GROQ_API_KEY=$GROQ_API_KEY
# Conda
llama stack build --template groq --image-type conda
llama stack run ./run.yaml \
  --port $LLAMA_STACK_PORT \
  --env GROQ_API_KEY=$GROQ_API_KEY
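When streaming is enabled, chunks come back as server-sent-events lines in the OpenAI streaming format that Groq emits. A minimal decoding sketch (illustrative, not the adapter's actual parsing code; field names assume the OpenAI streaming schema):

```python
import json

def parse_sse_line(line):
    """Return the text delta from one 'data: ...' SSE line, or None.

    Illustrative helper: handles the '[DONE]' sentinel and ignores
    non-data lines such as keep-alive comments.
    """
    if not line.startswith("data: "):
        return None
    payload = line[len("data: "):].strip()
    if payload == "[DONE]":
        return None
    return json.loads(payload)["choices"][0]["delta"].get("content")

print(parse_sse_line('data: {"choices":[{"delta":{"content":"Hi"}}]}'))  # → Hi
```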
Sources
Before submitting
- [ ] This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
- [ ] Ran pre-commit to handle lint / formatting issues.
- [x] Read the contributor guideline, Pull Request section
- [ ] Updated relevant documentation.
- [x] Wrote necessary unit or integration tests.