Implement chat & chat streaming for Anthropic in Deployments

Open gabrielfu opened this issue 1 year ago • 2 comments

🛠 DevTools 🛠

Install mlflow from this PR

pip install git+https://github.com/mlflow/mlflow.git@refs/pull/11195/merge

Checkout with GitHub CLI

gh pr checkout 11195

Related Issues/PRs

#xxx

What changes are proposed in this pull request?

Implement chat & chat streaming endpoints for Anthropic provider in MLflow Deployments Server

How is this PR tested?

  • [ ] Existing unit/integration tests
  • [X] New unit/integration tests
  • [ ] Manual tests

Does this PR require documentation update?

  • [ ] No. You can skip the rest of this section.
  • [X] Yes. I've updated:
    • [ ] Examples
    • [X] API references
    • [ ] Instructions

Release Notes

Is this a user-facing change?

  • [ ] No. You can skip the rest of this section.
  • [X] Yes. Give a description of this change to be included in the release notes for MLflow users.

Implement chat & chat streaming endpoints for Anthropic provider in MLflow Deployments Server

What component(s), interfaces, languages, and integrations does this PR affect?

Components

  • [ ] area/artifacts: Artifact stores and artifact logging
  • [ ] area/build: Build and test infrastructure for MLflow
  • [X] area/deployments: MLflow Deployments client APIs, server, and third-party Deployments integrations
  • [ ] area/docs: MLflow documentation pages
  • [ ] area/examples: Example code
  • [ ] area/model-registry: Model Registry service, APIs, and the fluent client calls for Model Registry
  • [ ] area/models: MLmodel format, model serialization/deserialization, flavors
  • [ ] area/recipes: Recipes, Recipe APIs, Recipe configs, Recipe Templates
  • [ ] area/projects: MLproject format, project running backends
  • [ ] area/scoring: MLflow Model server, model deployment tools, Spark UDFs
  • [ ] area/server-infra: MLflow Tracking server backend
  • [ ] area/tracking: Tracking Service, tracking client APIs, autologging

Interface

  • [ ] area/uiux: Front-end, user experience, plotting, JavaScript, JavaScript dev server
  • [ ] area/docker: Docker use across MLflow's components, such as MLflow Projects and MLflow Models
  • [ ] area/sqlalchemy: Use of SQLAlchemy in the Tracking Service or Model Registry
  • [ ] area/windows: Windows support

Language

  • [ ] language/r: R APIs and clients
  • [ ] language/java: Java APIs and clients
  • [ ] language/new: Proposals for new client languages

Integrations

  • [ ] integrations/azure: Azure and Azure ML integrations
  • [ ] integrations/sagemaker: SageMaker integrations
  • [ ] integrations/databricks: Databricks integrations

How should the PR be classified in the release notes? Choose one:

  • [ ] rn/none - No description will be included. The PR will be mentioned only by the PR number in the "Small Bugfixes and Documentation Updates" section
  • [ ] rn/breaking-change - The PR will be mentioned in the "Breaking Changes" section
  • [X] rn/feature - A new user-facing feature worth mentioning in the release notes
  • [ ] rn/bug-fix - A user-facing bug fix worth mentioning in the release notes
  • [ ] rn/documentation - A user-facing documentation change worth mentioning in the release notes

gabrielfu, Feb 20 '24 06:02

Documentation preview for 84f5bb90ef257921981bce0e3cca43e9c81ab7b9 will be available when this CircleCI job completes successfully.

More info
  • Ignore this comment if this PR does not change the documentation.
  • It takes a few minutes for the preview to be available.
  • The preview is updated when a new commit is pushed to this PR.
  • This comment was created by https://github.com/mlflow/mlflow/actions/runs/8000230275.

github-actions[bot], Feb 20 '24 06:02

@BenWilson2 could you help manually test this PR with your Anthropic API key to confirm that it works?

# config.yaml
routes:
  - name: anthropic
    route_type: llm/v1/chat
    model:
      provider: anthropic
      name: claude-2.1
      config:
        anthropic_api_key: <key>

Start the server:

mlflow deployments start-server --config-path config.yaml --workers 1

Non-streaming:

curl -X POST http://127.0.0.1:5000/endpoints/anthropic/invocations \
  -H "Content-Type: application/json" \
  -d '{"messages": [{"role": "system","content": "You are a funny assistant"},{"role": "user","content": "Hi"},{"role": "assistant","content": "Hi"},{"role": "user","content": "Tell me something fun."}], "temperature": 1.5, "max_tokens": 100}'

Streaming:

curl -X POST http://127.0.0.1:5000/endpoints/anthropic/invocations \
  -H "Content-Type: application/json" \
  -d '{"messages": [{"role": "system","content": "You are a funny assistant"},{"role": "user","content": "Hi"},{"role": "assistant","content": "Hi"},{"role": "user","content": "Tell me something fun."}], "stream": true, "temperature": 1.5, "max_tokens": 100}'
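
If Python is more convenient than curl, the two requests above can be sketched with the standard library alone. The helper names `build_chat_payload` and `post_chat` are made up for illustration and are not part of MLflow; the URL and the endpoint name `anthropic` are taken from the config above and assume the server is running locally.

```python
import json
import urllib.request

# Assumption: the Deployments server started from config.yaml listens here.
BASE_URL = "http://127.0.0.1:5000"

MESSAGES = [
    {"role": "system", "content": "You are a funny assistant"},
    {"role": "user", "content": "Hi"},
    {"role": "assistant", "content": "Hi"},
    {"role": "user", "content": "Tell me something fun."},
]


def build_chat_payload(messages, stream=False, temperature=1.5, max_tokens=100):
    """Build the JSON body used in the curl commands (hypothetical helper)."""
    payload = {
        "messages": messages,
        "temperature": temperature,
        "max_tokens": max_tokens,
    }
    if stream:
        # Only the streaming request carries the "stream" flag.
        payload["stream"] = True
    return payload


def post_chat(endpoint, payload, base_url=BASE_URL):
    """POST the payload and yield the response body line by line (hypothetical helper)."""
    request = urllib.request.Request(
        f"{base_url}/endpoints/{endpoint}/invocations",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        for raw_line in response:
            yield raw_line.decode("utf-8").rstrip("\r\n")


# Usage, with the server running:
#   for line in post_chat("anthropic", build_chat_payload(MESSAGES, stream=True)):
#       print(line)
```

Reading the response line by line works for both modes: the non-streaming reply arrives as one JSON line, and the streaming reply as a series of `data: ...` lines.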

gabrielfu, Feb 20 '24 07:02

Hey @gabrielfu, sorry for dropping the ball on this.

Streaming:

~ via 🅒 base via 🐍 dev-env
➜   curl -X POST http://127.0.0.1:5000/endpoints/anthropic-chat/invocations \
  -H "Content-Type: application/json" \
  -d '{"messages": [{"role": "system","content": "You are a funny assistant"},{"role": "user","content": "Hi"},{"role": "assistant","content": "Hi"},{"role": "user","content": "Tell me something fun."}], "stream": true, "temperature": 1.5, "max_tokens": 100}'
data: {"id": "msg_013uCEMi8ow4BRipK6qRzrmW", "object": "chat.completion.chunk", "created": 1709863952, "model": "claude-2.1", "choices": [{"index": 0, "finish_reason": null, "delta": {"role": null, "content": ""}}]}

data: {"id": "msg_013uCEMi8ow4BRipK6qRzrmW", "object": "chat.completion.chunk", "created": 1709863952, "model": "claude-2.1", "choices": [{"index": 0, "finish_reason": null, "delta": {"role": null, "content": "Here"}}]}

data: {"id": "msg_013uCEMi8ow4BRipK6qRzrmW", "object": "chat.completion.chunk", "created": 1709863952, "model": "claude-2.1", "choices": [{"index": 0, "finish_reason": null, "delta": {"role": null, "content": "'s"}}]}

data: {"id": "msg_013uCEMi8ow4BRipK6qRzrmW", "object": "chat.completion.chunk", "created": 1709863952, "model": "claude-2.1", "choices": [{"index": 0, "finish_reason": null, "delta": {"role": null, "content": " a"}}]}

data: {"id": "msg_013uCEMi8ow4BRipK6qRzrmW", "object": "chat.completion.chunk", "created": 1709863952, "model": "claude-2.1", "choices": [{"index": 0, "finish_reason": null, "delta": {"role": null, "content": " silly"}}]}

data: {"id": "msg_013uCEMi8ow4BRipK6qRzrmW", "object": "chat.completion.chunk", "created": 1709863952, "model": "claude-2.1", "choices": [{"index": 0, "finish_reason": null, "delta": {"role": null, "content": " joke"}}]}

data: {"id": "msg_013uCEMi8ow4BRipK6qRzrmW", "object": "chat.completion.chunk", "created": 1709863952, "model": "claude-2.1", "choices": [{"index": 0, "finish_reason": null, "delta": {"role": null, "content": " for"}}]}

data: {"id": "msg_013uCEMi8ow4BRipK6qRzrmW", "object": "chat.completion.chunk", "created": 1709863952, "model": "claude-2.1", "choices": [{"index": 0, "finish_reason": null, "delta": {"role": null, "content": " you"}}]}

data: {"id": "msg_013uCEMi8ow4BRipK6qRzrmW", "object": "chat.completion.chunk", "created": 1709863952, "model": "claude-2.1", "choices": [{"index": 0, "finish_reason": null, "delta": {"role": null, "content": ":"}}]}

data: {"id": "msg_013uCEMi8ow4BRipK6qRzrmW", "object": "chat.completion.chunk", "created": 1709863953, "model": "claude-2.1", "choices": [{"index": 0, "finish_reason": null, "delta": {"role": null, "content": " Why"}}]}

data: {"id": "msg_013uCEMi8ow4BRipK6qRzrmW", "object": "chat.completion.chunk", "created": 1709863953, "model": "claude-2.1", "choices": [{"index": 0, "finish_reason": null, "delta": {"role": null, "content": " can"}}]}

data: {"id": "msg_013uCEMi8ow4BRipK6qRzrmW", "object": "chat.completion.chunk", "created": 1709863953, "model": "claude-2.1", "choices": [{"index": 0, "finish_reason": null, "delta": {"role": null, "content": "'t"}}]}

data: {"id": "msg_013uCEMi8ow4BRipK6qRzrmW", "object": "chat.completion.chunk", "created": 1709863953, "model": "claude-2.1", "choices": [{"index": 0, "finish_reason": null, "delta": {"role": null, "content": " a"}}]}

data: {"id": "msg_013uCEMi8ow4BRipK6qRzrmW", "object": "chat.completion.chunk", "created": 1709863953, "model": "claude-2.1", "choices": [{"index": 0, "finish_reason": null, "delta": {"role": null, "content": " bicycle"}}]}

data: {"id": "msg_013uCEMi8ow4BRipK6qRzrmW", "object": "chat.completion.chunk", "created": 1709863953, "model": "claude-2.1", "choices": [{"index": 0, "finish_reason": null, "delta": {"role": null, "content": " stand"}}]}

data: {"id": "msg_013uCEMi8ow4BRipK6qRzrmW", "object": "chat.completion.chunk", "created": 1709863953, "model": "claude-2.1", "choices": [{"index": 0, "finish_reason": null, "delta": {"role": null, "content": " up"}}]}

data: {"id": "msg_013uCEMi8ow4BRipK6qRzrmW", "object": "chat.completion.chunk", "created": 1709863953, "model": "claude-2.1", "choices": [{"index": 0, "finish_reason": null, "delta": {"role": null, "content": " by"}}]}

data: {"id": "msg_013uCEMi8ow4BRipK6qRzrmW", "object": "chat.completion.chunk", "created": 1709863953, "model": "claude-2.1", "choices": [{"index": 0, "finish_reason": null, "delta": {"role": null, "content": " itself"}}]}

data: {"id": "msg_013uCEMi8ow4BRipK6qRzrmW", "object": "chat.completion.chunk", "created": 1709863953, "model": "claude-2.1", "choices": [{"index": 0, "finish_reason": null, "delta": {"role": null, "content": "?"}}]}

data: {"id": "msg_013uCEMi8ow4BRipK6qRzrmW", "object": "chat.completion.chunk", "created": 1709863953, "model": "claude-2.1", "choices": [{"index": 0, "finish_reason": null, "delta": {"role": null, "content": " Because"}}]}

data: {"id": "msg_013uCEMi8ow4BRipK6qRzrmW", "object": "chat.completion.chunk", "created": 1709863953, "model": "claude-2.1", "choices": [{"index": 0, "finish_reason": null, "delta": {"role": null, "content": " it"}}]}

data: {"id": "msg_013uCEMi8ow4BRipK6qRzrmW", "object": "chat.completion.chunk", "created": 1709863953, "model": "claude-2.1", "choices": [{"index": 0, "finish_reason": null, "delta": {"role": null, "content": "'s"}}]}

data: {"id": "msg_013uCEMi8ow4BRipK6qRzrmW", "object": "chat.completion.chunk", "created": 1709863953, "model": "claude-2.1", "choices": [{"index": 0, "finish_reason": null, "delta": {"role": null, "content": " two"}}]}

data: {"id": "msg_013uCEMi8ow4BRipK6qRzrmW", "object": "chat.completion.chunk", "created": 1709863953, "model": "claude-2.1", "choices": [{"index": 0, "finish_reason": null, "delta": {"role": null, "content": "-"}}]}

data: {"id": "msg_013uCEMi8ow4BRipK6qRzrmW", "object": "chat.completion.chunk", "created": 1709863953, "model": "claude-2.1", "choices": [{"index": 0, "finish_reason": null, "delta": {"role": null, "content": "t"}}]}

data: {"id": "msg_013uCEMi8ow4BRipK6qRzrmW", "object": "chat.completion.chunk", "created": 1709863953, "model": "claude-2.1", "choices": [{"index": 0, "finish_reason": null, "delta": {"role": null, "content": "ired"}}]}

data: {"id": "msg_013uCEMi8ow4BRipK6qRzrmW", "object": "chat.completion.chunk", "created": 1709863953, "model": "claude-2.1", "choices": [{"index": 0, "finish_reason": null, "delta": {"role": null, "content": "!"}}]}

data: {"id": "msg_013uCEMi8ow4BRipK6qRzrmW", "object": "chat.completion.chunk", "created": 1709863953, "model": "claude-2.1", "choices": [{"index": 0, "finish_reason": "stop", "delta": {"role": null, "content": null}}]}
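
To read the streamed output as a single message, the `delta.content` pieces can be concatenated in order. The following is a minimal sketch that assumes each event is one `data: {...}` line shaped like the chunks above; `assemble_stream` is a hypothetical helper, not an MLflow API.

```python
import json


def assemble_stream(lines):
    """Join the delta.content pieces of `data: {...}` lines into one string.

    Returns (text, finish_reason). Hypothetical helper, not part of MLflow.
    """
    parts = []
    finish_reason = None
    for line in lines:
        line = line.strip()
        if not line.startswith("data:"):
            continue  # skip the blank separator lines between events
        chunk = json.loads(line[len("data:"):])
        for choice in chunk["choices"]:
            content = choice["delta"].get("content")
            if content:  # the first chunk carries "" and the last carries null
                parts.append(content)
            if choice.get("finish_reason") is not None:
                finish_reason = choice["finish_reason"]
    return "".join(parts), finish_reason


# Three chunks shaped like the output above, reduced to the fields used here.
sample = [
    'data: {"choices": [{"index": 0, "finish_reason": null, "delta": {"role": null, "content": "Here"}}]}',
    "",
    'data: {"choices": [{"index": 0, "finish_reason": null, "delta": {"role": null, "content": " a"}}]}',
    "",
    'data: {"choices": [{"index": 0, "finish_reason": "stop", "delta": {"role": null, "content": null}}]}',
]

text, reason = assemble_stream(sample)
print(repr(text), reason)
```

Applied to the full output above, the pieces join into "Here's a silly joke for you: Why can't a bicycle stand up by itself? Because it's two-tired!" with a final `finish_reason` of `stop`.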

Non-streaming:

~ via 🅒 base via 🐍 dev-env
➜ curl -X POST http://127.0.0.1:5000/endpoints/anthropic-chat/invocations \
  -H "Content-Type: application/json" \
  -d '{"messages": [{"role": "system","content": "You are a funny assistant"},{"role": "user","content": "Hi"},{"role": "assistant","content": "Hi"},{"role": "user","content": "Tell me something fun."}], "temperature": 1.5, "max_tokens": 100}'
{"id":"msg_017WWVS7GdTYGmAaduF7DWQv","object":"chat.completion","created":1709863852,"model":"claude-2.1","choices":[{"index":0,"message":{"role":"assistant","content":"Here's something I find amusing - I don't actually experience fun or have a sense of humor myself. As an AI assistant created by Anthropic to be helpful, harmless, and honest, I don't have subjective experiences. I can try to tell jokes or fun facts if you'd like, but I rely on my training to determine what kinds of things tend to elicit amusement or enjoyment from humans!"},"finish_reason":"stop"}],"usage":{"prompt_tokens":29,"completion_tokens":85,"total_tokens":114}}%
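
A quick sanity check on the non-streaming response can be sketched as below: it pulls out the reply and confirms that the token counts add up. `summarize_chat_response` is a hypothetical helper, and the sample body is the response above with the message content shortened. The trailing `%` in the terminal output appears to be the shell's marker for output that lacks a final newline, so it has to be dropped before parsing.

```python
import json


def summarize_chat_response(body):
    """Extract the reply and check token accounting (hypothetical helper)."""
    response = json.loads(body)
    choice = response["choices"][0]
    usage = response["usage"]
    tokens_add_up = (
        usage["prompt_tokens"] + usage["completion_tokens"] == usage["total_tokens"]
    )
    return {
        "role": choice["message"]["role"],
        "content": choice["message"]["content"],
        "finish_reason": choice["finish_reason"],
        "tokens_add_up": tokens_add_up,
    }


# Same shape as the response above; the content is shortened for brevity.
body = (
    '{"id":"msg_017WWVS7GdTYGmAaduF7DWQv","object":"chat.completion",'
    '"created":1709863852,"model":"claude-2.1",'
    '"choices":[{"index":0,"message":{"role":"assistant","content":"(shortened)"},'
    '"finish_reason":"stop"}],'
    '"usage":{"prompt_tokens":29,"completion_tokens":85,"total_tokens":114}}'
)

summary = summarize_chat_response(body)
print(summary["finish_reason"], summary["tokens_add_up"])  # prints: stop True
```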

BenWilson2, Mar 08 '24 02:03

Thanks @BenWilson2 for the manual test!

gabrielfu, Mar 08 '24 03:03