Implement chat & chat streaming for Anthropic in Deployments
🛠 DevTools 🛠
Install mlflow from this PR
pip install git+https://github.com/mlflow/mlflow.git@refs/pull/11195/merge
Checkout with GitHub CLI
gh pr checkout 11195
Related Issues/PRs
#xxx
What changes are proposed in this pull request?
Implement chat & chat streaming endpoints for Anthropic provider in MLflow Deployments Server
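For context, the translation works roughly as sketched below. This is a minimal, hypothetical illustration only, not the actual provider code added in this PR: Anthropic's Messages API takes user/assistant turns in `messages` and the system prompt as a separate top-level `system` field, so an OpenAI-style chat payload has to be split before the request is forwarded.

# Hypothetical sketch of the chat payload translation; names and structure are
# illustrative and do not mirror the provider module added in this PR.
def to_anthropic_chat_payload(payload: dict) -> dict:
    messages = payload["messages"]
    # Anthropic's Messages API expects the system prompt as a top-level field,
    # not as a message with role "system".
    system = "\n".join(m["content"] for m in messages if m["role"] == "system")
    chat_messages = [m for m in messages if m["role"] != "system"]
    request = {
        "model": payload.get("model", "claude-2.1"),
        "messages": chat_messages,
        "max_tokens": payload.get("max_tokens", 1024),
    }
    if system:
        request["system"] = system
    if "temperature" in payload:
        request["temperature"] = payload["temperature"]
    if payload.get("stream"):
        request["stream"] = True
    return request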
How is this PR tested?
- [ ] Existing unit/integration tests
- [X] New unit/integration tests
- [ ] Manual tests
Does this PR require documentation update?
- [ ] No. You can skip the rest of this section.
- [X] Yes. I've updated:
- [ ] Examples
- [X] API references
- [ ] Instructions
Release Notes
Is this a user-facing change?
- [ ] No. You can skip the rest of this section.
- [X] Yes. Give a description of this change to be included in the release notes for MLflow users.
Implement chat & chat streaming endpoints for Anthropic provider in MLflow Deployments Server
What component(s), interfaces, languages, and integrations does this PR affect?
Components
- [ ] area/artifacts: Artifact stores and artifact logging
- [ ] area/build: Build and test infrastructure for MLflow
- [X] area/deployments: MLflow Deployments client APIs, server, and third-party Deployments integrations
- [ ] area/docs: MLflow documentation pages
- [ ] area/examples: Example code
- [ ] area/model-registry: Model Registry service, APIs, and the fluent client calls for Model Registry
- [ ] area/models: MLmodel format, model serialization/deserialization, flavors
- [ ] area/recipes: Recipes, Recipe APIs, Recipe configs, Recipe Templates
- [ ] area/projects: MLproject format, project running backends
- [ ] area/scoring: MLflow Model server, model deployment tools, Spark UDFs
- [ ] area/server-infra: MLflow Tracking server backend
- [ ] area/tracking: Tracking Service, tracking client APIs, autologging
Interface
- [ ] area/uiux: Front-end, user experience, plotting, JavaScript, JavaScript dev server
- [ ] area/docker: Docker use across MLflow's components, such as MLflow Projects and MLflow Models
- [ ] area/sqlalchemy: Use of SQLAlchemy in the Tracking Service or Model Registry
- [ ] area/windows: Windows support
Language
- [ ] language/r: R APIs and clients
- [ ] language/java: Java APIs and clients
- [ ] language/new: Proposals for new client languages
Integrations
- [ ] integrations/azure: Azure and Azure ML integrations
- [ ] integrations/sagemaker: SageMaker integrations
- [ ] integrations/databricks: Databricks integrations
How should the PR be classified in the release notes? Choose one:
- [ ] rn/none - No description will be included. The PR will be mentioned only by the PR number in the "Small Bugfixes and Documentation Updates" section
- [ ] rn/breaking-change - The PR will be mentioned in the "Breaking Changes" section
- [X] rn/feature - A new user-facing feature worth mentioning in the release notes
- [ ] rn/bug-fix - A user-facing bug fix worth mentioning in the release notes
- [ ] rn/documentation - A user-facing documentation change worth mentioning in the release notes
Documentation preview for 84f5bb90ef257921981bce0e3cca43e9c81ab7b9 will be available when this CircleCI job completes successfully.
More info
- Ignore this comment if this PR does not change the documentation.
- It takes a few minutes for the preview to be available.
- The preview is updated when a new commit is pushed to this PR.
- This comment was created by https://github.com/mlflow/mlflow/actions/runs/8000230275.
@BenWilson2 could you help manually test this PR with your Anthropic API key to confirm it works?
# config.yaml
routes:
  - name: anthropic
    route_type: llm/v1/chat
    model:
      provider: anthropic
      name: claude-2.1
      config:
        anthropic_api_key: <key>
mlflow deployments start-server --config-path config.yaml --workers 1
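Once the server is running, the endpoint can also be queried from Python with the MLflow deployments client. The snippet below is only a sketch assuming the standard mlflow.deployments client API; the curl calls that follow exercise the same route directly.

# Query the locally running deployments server from Python (illustrative).
from mlflow.deployments import get_deploy_client

client = get_deploy_client("http://127.0.0.1:5000")
response = client.predict(
    endpoint="anthropic",
    inputs={
        "messages": [{"role": "user", "content": "Tell me something fun."}],
        "temperature": 1.5,
        "max_tokens": 100,
    },
)
print(response)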
Non-streaming:
curl -X POST http://127.0.0.1:5000/endpoints/anthropic/invocations \
-H "Content-Type: application/json" \
-d '{"messages": [{"role": "system","content": "You are a funny assistant"},{"role": "user","content": "Hi"},{"role": "assistant","content": "Hi"},{"role": "user","content": "Tell me something fun."}], "temperature": 1.5, "max_tokens": 100}'
Streaming:
curl -X POST http://127.0.0.1:5000/endpoints/anthropic/invocations \
-H "Content-Type: application/json" \
-d '{"messages": [{"role": "system","content": "You are a funny assistant"},{"role": "user","content": "Hi"},{"role": "assistant","content": "Hi"},{"role": "user","content": "Tell me something fun."}], "stream": true, "temperature": 1.5, "max_tokens": 100}'
Hey @gabrielfu, sorry for dropping the ball on this.
Streaming:
~ via 🅒 base via 🐍 dev-env
➜ curl -X POST http://127.0.0.1:5000/endpoints/anthropic-chat/invocations \
-H "Content-Type: application/json" \
-d '{"messages": [{"role": "system","content": "You are a funny assistant"},{"role": "user","content": "Hi"},{"role": "assistant","content": "Hi"},{"role": "user","content": "Tell me something fun."}], "stream": true, "temperature": 1.5, "max_tokens": 100}'
data: {"id": "msg_013uCEMi8ow4BRipK6qRzrmW", "object": "chat.completion.chunk", "created": 1709863952, "model": "claude-2.1", "choices": [{"index": 0, "finish_reason": null, "delta": {"role": null, "content": ""}}]}
data: {"id": "msg_013uCEMi8ow4BRipK6qRzrmW", "object": "chat.completion.chunk", "created": 1709863952, "model": "claude-2.1", "choices": [{"index": 0, "finish_reason": null, "delta": {"role": null, "content": "Here"}}]}
data: {"id": "msg_013uCEMi8ow4BRipK6qRzrmW", "object": "chat.completion.chunk", "created": 1709863952, "model": "claude-2.1", "choices": [{"index": 0, "finish_reason": null, "delta": {"role": null, "content": "'s"}}]}
data: {"id": "msg_013uCEMi8ow4BRipK6qRzrmW", "object": "chat.completion.chunk", "created": 1709863952, "model": "claude-2.1", "choices": [{"index": 0, "finish_reason": null, "delta": {"role": null, "content": " a"}}]}
data: {"id": "msg_013uCEMi8ow4BRipK6qRzrmW", "object": "chat.completion.chunk", "created": 1709863952, "model": "claude-2.1", "choices": [{"index": 0, "finish_reason": null, "delta": {"role": null, "content": " silly"}}]}
data: {"id": "msg_013uCEMi8ow4BRipK6qRzrmW", "object": "chat.completion.chunk", "created": 1709863952, "model": "claude-2.1", "choices": [{"index": 0, "finish_reason": null, "delta": {"role": null, "content": " joke"}}]}
data: {"id": "msg_013uCEMi8ow4BRipK6qRzrmW", "object": "chat.completion.chunk", "created": 1709863952, "model": "claude-2.1", "choices": [{"index": 0, "finish_reason": null, "delta": {"role": null, "content": " for"}}]}
data: {"id": "msg_013uCEMi8ow4BRipK6qRzrmW", "object": "chat.completion.chunk", "created": 1709863952, "model": "claude-2.1", "choices": [{"index": 0, "finish_reason": null, "delta": {"role": null, "content": " you"}}]}
data: {"id": "msg_013uCEMi8ow4BRipK6qRzrmW", "object": "chat.completion.chunk", "created": 1709863952, "model": "claude-2.1", "choices": [{"index": 0, "finish_reason": null, "delta": {"role": null, "content": ":"}}]}
data: {"id": "msg_013uCEMi8ow4BRipK6qRzrmW", "object": "chat.completion.chunk", "created": 1709863953, "model": "claude-2.1", "choices": [{"index": 0, "finish_reason": null, "delta": {"role": null, "content": " Why"}}]}
data: {"id": "msg_013uCEMi8ow4BRipK6qRzrmW", "object": "chat.completion.chunk", "created": 1709863953, "model": "claude-2.1", "choices": [{"index": 0, "finish_reason": null, "delta": {"role": null, "content": " can"}}]}
data: {"id": "msg_013uCEMi8ow4BRipK6qRzrmW", "object": "chat.completion.chunk", "created": 1709863953, "model": "claude-2.1", "choices": [{"index": 0, "finish_reason": null, "delta": {"role": null, "content": "'t"}}]}
data: {"id": "msg_013uCEMi8ow4BRipK6qRzrmW", "object": "chat.completion.chunk", "created": 1709863953, "model": "claude-2.1", "choices": [{"index": 0, "finish_reason": null, "delta": {"role": null, "content": " a"}}]}
data: {"id": "msg_013uCEMi8ow4BRipK6qRzrmW", "object": "chat.completion.chunk", "created": 1709863953, "model": "claude-2.1", "choices": [{"index": 0, "finish_reason": null, "delta": {"role": null, "content": " bicycle"}}]}
data: {"id": "msg_013uCEMi8ow4BRipK6qRzrmW", "object": "chat.completion.chunk", "created": 1709863953, "model": "claude-2.1", "choices": [{"index": 0, "finish_reason": null, "delta": {"role": null, "content": " stand"}}]}
data: {"id": "msg_013uCEMi8ow4BRipK6qRzrmW", "object": "chat.completion.chunk", "created": 1709863953, "model": "claude-2.1", "choices": [{"index": 0, "finish_reason": null, "delta": {"role": null, "content": " up"}}]}
data: {"id": "msg_013uCEMi8ow4BRipK6qRzrmW", "object": "chat.completion.chunk", "created": 1709863953, "model": "claude-2.1", "choices": [{"index": 0, "finish_reason": null, "delta": {"role": null, "content": " by"}}]}
data: {"id": "msg_013uCEMi8ow4BRipK6qRzrmW", "object": "chat.completion.chunk", "created": 1709863953, "model": "claude-2.1", "choices": [{"index": 0, "finish_reason": null, "delta": {"role": null, "content": " itself"}}]}
data: {"id": "msg_013uCEMi8ow4BRipK6qRzrmW", "object": "chat.completion.chunk", "created": 1709863953, "model": "claude-2.1", "choices": [{"index": 0, "finish_reason": null, "delta": {"role": null, "content": "?"}}]}
data: {"id": "msg_013uCEMi8ow4BRipK6qRzrmW", "object": "chat.completion.chunk", "created": 1709863953, "model": "claude-2.1", "choices": [{"index": 0, "finish_reason": null, "delta": {"role": null, "content": " Because"}}]}
data: {"id": "msg_013uCEMi8ow4BRipK6qRzrmW", "object": "chat.completion.chunk", "created": 1709863953, "model": "claude-2.1", "choices": [{"index": 0, "finish_reason": null, "delta": {"role": null, "content": " it"}}]}
data: {"id": "msg_013uCEMi8ow4BRipK6qRzrmW", "object": "chat.completion.chunk", "created": 1709863953, "model": "claude-2.1", "choices": [{"index": 0, "finish_reason": null, "delta": {"role": null, "content": "'s"}}]}
data: {"id": "msg_013uCEMi8ow4BRipK6qRzrmW", "object": "chat.completion.chunk", "created": 1709863953, "model": "claude-2.1", "choices": [{"index": 0, "finish_reason": null, "delta": {"role": null, "content": " two"}}]}
data: {"id": "msg_013uCEMi8ow4BRipK6qRzrmW", "object": "chat.completion.chunk", "created": 1709863953, "model": "claude-2.1", "choices": [{"index": 0, "finish_reason": null, "delta": {"role": null, "content": "-"}}]}
data: {"id": "msg_013uCEMi8ow4BRipK6qRzrmW", "object": "chat.completion.chunk", "created": 1709863953, "model": "claude-2.1", "choices": [{"index": 0, "finish_reason": null, "delta": {"role": null, "content": "t"}}]}
data: {"id": "msg_013uCEMi8ow4BRipK6qRzrmW", "object": "chat.completion.chunk", "created": 1709863953, "model": "claude-2.1", "choices": [{"index": 0, "finish_reason": null, "delta": {"role": null, "content": "ired"}}]}
data: {"id": "msg_013uCEMi8ow4BRipK6qRzrmW", "object": "chat.completion.chunk", "created": 1709863953, "model": "claude-2.1", "choices": [{"index": 0, "finish_reason": null, "delta": {"role": null, "content": "!"}}]}
data: {"id": "msg_013uCEMi8ow4BRipK6qRzrmW", "object": "chat.completion.chunk", "created": 1709863953, "model": "claude-2.1", "choices": [{"index": 0, "finish_reason": "stop", "delta": {"role": null, "content": null}}]}
Non-streaming:
~ via 🅒 base via 🐍 dev-env
➜ curl -X POST http://127.0.0.1:5000/endpoints/anthropic-chat/invocations \
-H "Content-Type: application/json" \
-d '{"messages": [{"role": "system","content": "You are a funny assistant"},{"role": "user","content": "Hi"},{"role": "assistant","content": "Hi"},{"role": "user","content": "Tell me something fun."}], "temperature": 1.5, "max_tokens": 100}'
{"id":"msg_017WWVS7GdTYGmAaduF7DWQv","object":"chat.completion","created":1709863852,"model":"claude-2.1","choices":[{"index":0,"message":{"role":"assistant","content":"Here's something I find amusing - I don't actually experience fun or have a sense of humor myself. As an AI assistant created by Anthropic to be helpful, harmless, and honest, I don't have subjective experiences. I can try to tell jokes or fun facts if you'd like, but I rely on my training to determine what kinds of things tend to elicit amusement or enjoyment from humans!"},"finish_reason":"stop"}],"usage":{"prompt_tokens":29,"completion_tokens":85,"total_tokens":114}}%
Thanks @BenWilson2 for the manual test!