Extract databricks info from langchain flavor

Open liangz1 opened this issue 1 year ago • 1 comment

🛠 DevTools 🛠

Install mlflow from this PR

pip install git+https://github.com/mlflow/mlflow.git@refs/pull/10969/merge

Checkout with GitHub CLI

gh pr checkout 10969

Related Issues/PRs

#xxx

What changes are proposed in this pull request?

mlflow.langchain.log_model detects the Databricks dependencies of a LangChain model and saves a dictionary of the detected endpoint names and index names in the MLmodel file under the "langchain" flavor.
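
To make the result concrete, here is a minimal sketch of how a consumer might read the detected names back out of a parsed MLmodel file. Only the "langchain" flavor name and the individual key names come from this description; the nested `databricks_dependency` key, the list-valued layout, and the example endpoint and index names are assumptions made for illustration.

```python
# Hypothetical parsed MLmodel content (normally loaded from the YAML file).
# ASSUMPTION: the "databricks_dependency" key and the list-valued entries
# are illustrative; the PR description does not specify the exact layout.
mlmodel = {
    "flavors": {
        "langchain": {
            "databricks_dependency": {
                "databricks_chat_endpoint_name": ["my-chat-endpoint"],
                "databricks_vector_search_index_name": ["catalog.schema.my_index"],
            }
        }
    }
}


def get_databricks_dependencies(mlmodel_dict):
    """Return the saved Databricks names, or an empty dict if none were saved."""
    langchain_flavor = mlmodel_dict.get("flavors", {}).get("langchain", {})
    return langchain_flavor.get("databricks_dependency", {})


deps = get_databricks_dependencies(mlmodel)
for key, names in sorted(deps.items()):
    print(f"{key}: {names}")
```

A model logged without any Databricks components would simply yield an empty dictionary under this layout.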

What information do we extract?

from langchain_community.embeddings import DatabricksEmbeddings
# from DatabricksEmbeddings, extract databricks_embeddings_endpoint_name

from langchain.vectorstores import DatabricksVectorSearch
# from DatabricksVectorSearch, extract databricks_vector_search_index_name and databricks_vector_search_endpoint_name

from langchain_community.llms import Databricks
# from Databricks, extract databricks_llm_endpoint_name

from langchain.chat_models import ChatDatabricks
# from ChatDatabricks, extract databricks_chat_endpoint_name
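
The mapping above can be sketched as a type-based lookup that appends names to a shared dictionary. This is not the PR's implementation: the `Fake*` classes are stand-ins so the example runs without langchain installed, and their attribute names (`endpoint_name`, `index_name`) are hypothetical. Only the dictionary keys are taken from the comments above.

```python
from collections import defaultdict
from dataclasses import dataclass


# Stand-in classes (hypothetical) so the sketch runs without langchain.
@dataclass
class FakeDatabricksEmbeddings:
    endpoint_name: str


@dataclass
class FakeDatabricksVectorSearch:
    index_name: str
    endpoint_name: str


@dataclass
class FakeDatabricks:
    endpoint_name: str


@dataclass
class FakeChatDatabricks:
    endpoint_name: str


def extract_databricks_info(component, dependency_dict):
    """Append the names found on one component to dependency_dict."""
    if isinstance(component, FakeDatabricksEmbeddings):
        dependency_dict["databricks_embeddings_endpoint_name"].append(component.endpoint_name)
    elif isinstance(component, FakeDatabricksVectorSearch):
        dependency_dict["databricks_vector_search_index_name"].append(component.index_name)
        dependency_dict["databricks_vector_search_endpoint_name"].append(component.endpoint_name)
    elif isinstance(component, FakeDatabricks):
        dependency_dict["databricks_llm_endpoint_name"].append(component.endpoint_name)
    elif isinstance(component, FakeChatDatabricks):
        dependency_dict["databricks_chat_endpoint_name"].append(component.endpoint_name)
    # Unknown component types are ignored.
    return dependency_dict


components = [
    FakeDatabricksEmbeddings("embeddings-endpoint"),
    FakeDatabricksVectorSearch("catalog.schema.docs_index", "vs-endpoint"),
    FakeChatDatabricks("chat-endpoint"),
]
detected = defaultdict(list)
for component in components:
    extract_databricks_info(component, detected)
print(dict(detected))
```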

Which chains do we support?

Arbitrary LCEL chains are supported. Legacy chains have limited support: only RetrievalQA, StuffDocumentsChain, ReduceDocumentsChain, RefineDocumentsChain, MapRerankDocumentsChain, MapReduceDocumentsChain, and BaseConversationalRetrievalChain are handled. To support a custom chain, monkey-patch the function mlflow.langchain.databricks_dependencies._extract_dependency_dict_from_lc_model().

Reference: Here is a list of all built-in LCEL chains and legacy chains in langchain: https://python.langchain.com/docs/modules/chains.

How is this PR tested?

  • [x] Existing unit/integration tests
  • [x] New unit/integration tests
  • [x] Manual tests

Does this PR require documentation update?

  • [x] No. You can skip the rest of this section.
  • [ ] Yes. I've updated:
    • [ ] Examples
    • [ ] API references
    • [ ] Instructions

Release Notes

Is this a user-facing change?

  • [x] No. You can skip the rest of this section.
  • [ ] Yes. Give a description of this change to be included in the release notes for MLflow users.

What component(s), interfaces, languages, and integrations does this PR affect?

Components

  • [ ] area/artifacts: Artifact stores and artifact logging
  • [ ] area/build: Build and test infrastructure for MLflow
  • [ ] area/deployments: MLflow Deployments client APIs, server, and third-party Deployments integrations
  • [ ] area/docs: MLflow documentation pages
  • [ ] area/examples: Example code
  • [ ] area/model-registry: Model Registry service, APIs, and the fluent client calls for Model Registry
  • [x] area/models: MLmodel format, model serialization/deserialization, flavors
  • [ ] area/recipes: Recipes, Recipe APIs, Recipe configs, Recipe Templates
  • [ ] area/projects: MLproject format, project running backends
  • [ ] area/scoring: MLflow Model server, model deployment tools, Spark UDFs
  • [ ] area/server-infra: MLflow Tracking server backend
  • [ ] area/tracking: Tracking Service, tracking client APIs, autologging

Interface

  • [ ] area/uiux: Front-end, user experience, plotting, JavaScript, JavaScript dev server
  • [ ] area/docker: Docker use across MLflow's components, such as MLflow Projects and MLflow Models
  • [ ] area/sqlalchemy: Use of SQLAlchemy in the Tracking Service or Model Registry
  • [ ] area/windows: Windows support

Language

  • [ ] language/r: R APIs and clients
  • [ ] language/java: Java APIs and clients
  • [ ] language/new: Proposals for new client languages

Integrations

  • [ ] integrations/azure: Azure and Azure ML integrations
  • [ ] integrations/sagemaker: SageMaker integrations
  • [x] integrations/databricks: Databricks integrations

How should the PR be classified in the release notes? Choose one:

  • [x] rn/none - No description will be included. The PR will be mentioned only by the PR number in the "Small Bugfixes and Documentation Updates" section
  • [ ] rn/breaking-change - The PR will be mentioned in the "Breaking Changes" section
  • [ ] rn/feature - A new user-facing feature worth mentioning in the release notes
  • [ ] rn/bug-fix - A user-facing bug fix worth mentioning in the release notes
  • [ ] rn/documentation - A user-facing documentation change worth mentioning in the release notes

liangz1 • Jan 31 '24 21:01

Documentation preview for 726d65837490cc281a4bb063d3a011d165ce2afc will be available when this CircleCI job completes successfully.

More info
  • Ignore this comment if this PR does not change the documentation.
  • It takes a few minutes for the preview to be available.
  • The preview is updated when a new commit is pushed to this PR.
  • This comment was created by https://github.com/mlflow/mlflow/actions/runs/7833754897.

github-actions[bot] • Jan 31 '24 21:01