
Docs for fully on-prem

Open tpanza opened this issue 9 months ago • 2 comments

The doc page for Supported Infrastructure Components says:

You can choose to deploy Metaflow on: ... Any Kubernetes cluster including on-premise deployments.

But that link points to AWS installation docs.

I see #682 about this topic, and the discussion there suggests that, yes, Metaflow can run on fully on-prem infra. But I am having doubts because there are no docs on this subject, and the one mention I can find about this points to an AWS doc.

Where can I learn more about the steps involved for doing an installation onto an on-prem k8s cluster? And are there any limitations if I were to do so?

In case the details matter for my particular case: I have a 10-node on-prem RKE2 cluster with a mixture of GPU and CPU-only nodes, plus a Ceph storage cluster. I am looking for an MLOps framework to run on this.

tpanza, Apr 01 '25 18:04

+1 to this. We are in a similar situation and would really appreciate clearer guidance on fully on-prem installations. The mention of “any Kubernetes cluster” is encouraging, but it is confusing when the only linked docs are AWS-specific. Thanks!

stepanek-petr, Apr 05 '25 10:04

I've created PR #2521 to address this documentation gap. The PR includes:

  • Comprehensive guide for on-premises Kubernetes deployment
  • MinIO setup for S3-compatible storage
  • Example flow that works on any K8s cluster
  • Specific instructions for RKE2 and Ceph (as mentioned in your use case)
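
For readers who want a rough idea of what the storage side of such a setup involves, here is a minimal sketch. It is not taken from PR #2521. It assumes an S3-compatible endpoint (MinIO or a Ceph object gateway) reachable from both the workstation and the cluster, and it builds the kind of settings Metaflow reads from `~/.metaflowconfig/config.json`. The bucket name, endpoint URL, namespace, and the helper `build_metaflow_config` are all hypothetical placeholders; credentials for the object store are assumed to be supplied separately (for example through the usual AWS-style environment variables).

```python
import json


def build_metaflow_config(bucket, s3_endpoint, namespace="default"):
    """Build a Metaflow config dict for an S3-compatible object store.

    Hypothetical helper for illustration; the values passed in are
    placeholders, not settings taken from the PR or the Metaflow docs.
    """
    return {
        # Use the S3 datastore implementation, pointed at a non-AWS endpoint.
        "METAFLOW_DEFAULT_DATASTORE": "s3",
        "METAFLOW_DATASTORE_SYSROOT_S3": f"s3://{bucket}/metaflow",
        "METAFLOW_S3_ENDPOINT_URL": s3_endpoint,
        # Namespace in which Kubernetes jobs for remote steps are created.
        "METAFLOW_KUBERNETES_NAMESPACE": namespace,
    }


if __name__ == "__main__":
    config = build_metaflow_config(
        bucket="metaflow",  # assumed bucket name
        s3_endpoint="http://minio.example.internal:9000",  # assumed endpoint
        namespace="metaflow",  # assumed namespace
    )
    # Print the JSON so it can be reviewed before saving it as
    # ~/.metaflowconfig/config.json.
    print(json.dumps(config, indent=2))
```

A real deployment will likely need more than this (a metadata service, container image settings, node selectors for GPU nodes), so treat the output as a starting point to compare against the guide rather than a complete configuration.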

@tpanza @stepanek-petr - Would appreciate your feedback on whether this addresses your needs!

JonSnow1807, Jul 25 '25 00:07