Support a variable Python base image name
Currently the base image can be customised by setting METAFLOW_BATCH_CONTAINER_IMAGE or by passing image. If neither is specified, it falls back to an image matching the user's Python version, e.g. python:3.9.2.
This image name doesn't work for setups that use a registry other than Docker Hub, where the Python image name may differ. For example, when using an AWS ECR pull-through cache, the ecr-public Python image name is <account_id>.dkr.ecr.<aws_region>.amazonaws.com/public-ecr/docker/library/python:<version>. Using a registry like this is useful to avoid being rate limited by Docker Hub when pulling without authentication.
The /public-ecr/docker/library prefix is required on the image so that pull-through-cache images can coexist with non-pull-through-cache images in the same ECR registry.
I'd suggest making the default Python image name customisable via a new config/environment variable to support these configurations.
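As a rough sketch of what I mean (the variable name METAFLOW_BATCH_DEFAULT_PYTHON_IMAGE and the helper below are hypothetical and only for illustration, they are not existing Metaflow settings or code):

```python
import os
import platform

# Hypothetical variable name, used only for illustration.
# It is NOT an existing Metaflow setting.
DEFAULT_IMAGE_NAME_VAR = "METAFLOW_BATCH_DEFAULT_PYTHON_IMAGE"


def default_python_image(env=None, python_version=None):
    """Build the fallback image reference as '<name>:<python version>'.

    The image name comes from the hypothetical variable above and
    falls back to plain 'python' when the variable is unset or empty.
    """
    env = os.environ if env is None else env
    version = python_version or platform.python_version()
    name = env.get(DEFAULT_IMAGE_NAME_VAR) or "python"
    return f"{name}:{version}"
```

With the variable unset this would keep today's behaviour (python:<version>); with it set to the full pull-through-cache path, the default image would resolve to the ECR name without affecting any other image.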
Current workarounds
It's currently possible to work around this by setting METAFLOW_BATCH_CONTAINER_REGISTRY=<account_id>.dkr.ecr.<aws_region>.amazonaws.com/public-ecr/docker/library and then using the full ECR URL for any non-default images. As a result, the default Python image is pulled through the ECR pull-through cache, while for non-default images the registry is derived from the image URL itself.
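To illustrate why the workaround behaves this way, here is a minimal sketch of the resolution logic as I understand it. This is not Metaflow's actual code: the function names are hypothetical, and the rule used to decide whether an image already names a registry (first path component contains a "." or ":", or is "localhost") is the usual Docker convention, assumed here.

```python
def has_registry(image):
    """Return True if the image reference already starts with a registry host."""
    first, sep, _ = image.partition("/")
    # Without a '/', the reference is a bare name such as 'python:3.9.2'.
    return bool(sep) and ("." in first or ":" in first or first == "localhost")


def resolve_image(image, registry):
    """Prepend the configured registry only when the image has none of its own."""
    if registry and not has_registry(image):
        return f"{registry.rstrip('/')}/{image}"
    return image
```

Under these assumptions, the default python:<version> image picks up the configured registry prefix, while an image given as a full ECR URL is left unchanged.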
@Limess The workaround you referenced is indeed the solution we were going to recommend. For more involved use cases, we would ideally want to provide users with more fine-grained controls to determine the exact image to use. It's likely that users may want to compute a different image version compared to the one being used today. Let me think about what this user-specified translation layer can look like in practice and get back to you.