Does metaflow run on an on-prem Kubernetes cluster?
I do see issues and pull requests like this:
- https://github.com/Netflix/metaflow/issues/50
- https://github.com/Netflix/metaflow/pull/644
- https://github.com/Netflix/metaflow/pull/488
Does this mean metaflow will run fully on our on-prem Kubernetes cluster or is there still a requirement for Amazon services?
@Spenhouet Yes, you can execute Metaflow workflows on any Kubernetes cluster. We currently have a dependency on Amazon S3 for datastore but it is straightforward to swap that out with GCS/Azure Blob Store/MinIO etc. (particularly after #580 has been merged in). What does your setup look like? If you are using MinIO (or any other S3 compatible blob storage) as your on-prem blob store, we should be able to enable you to start experimenting with k8s on-prem right away. Ping us at http://slack.outerbounds.co and one of us will get you installation details.
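For reference, a minimal sketch of what such an on-prem setup might look like in `~/.metaflowconfig/config.json`, assuming MinIO as the S3-compatible datastore (the endpoint URL and bucket name below are placeholders, not defaults):

```json
{
  "METAFLOW_DEFAULT_DATASTORE": "s3",
  "METAFLOW_DATASTORE_SYSROOT_S3": "s3://my-metaflow-bucket/metaflow",
  "METAFLOW_S3_ENDPOINT_URL": "http://minio.example.internal:9000"
}
```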
I have installed Metaflow + the UI on an on-prem Kubernetes cluster with MinIO. Running the hello world tutorial with `--with kubernetes`, it seems that with the default container, the command generated by the step here could be improved to be MinIO compatible:
https://github.com/Netflix/metaflow/blob/62f5e52ebce755d9130287c1576011cb056e0e3d/metaflow/metaflow_environment.py#L92
`python -m awscli s3 cp s3://someurlto.tar job.tar` — as far as I know, no environment variable can override the S3 endpoint.
It could be parameterized as `python -m awscli --endpoint-url "$METAFLOW_S3_ENDPOINT_URL" s3 cp s3://someurlto.tar job.tar`.
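A sketch of what such a parameterization could look like (the helper name and env-var handling here are illustrative, not Metaflow's actual implementation):

```python
import os


def awscli_s3_cp(src, dst, endpoint_env="METAFLOW_S3_ENDPOINT_URL"):
    """Build the `python -m awscli` command for an S3 copy, adding
    --endpoint-url only when a custom endpoint (e.g. MinIO) is set.

    Illustrative sketch only; Metaflow's real code generation differs.
    """
    cmd = ["python", "-m", "awscli"]
    endpoint = os.environ.get(endpoint_env)
    if endpoint:
        # The AWS CLI accepts --endpoint-url as a global option.
        cmd += ["--endpoint-url", endpoint]
    cmd += ["s3", "cp", src, dst]
    return cmd
```

With the env var unset, this degenerates to the current default command, so existing AWS-backed deployments would be unaffected.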
@alexisdondon Passing in a custom endpoint URL is now supported for Kubernetes.