binderhub icon indicating copy to clipboard operation
binderhub copied to clipboard

fixes to make local registry (unauthenticated) usable

Open wolfv opened this issue 5 years ago • 4 comments

These are a couple of small fixes to make images of type localhost:32000/my-image work throughout binder.

With these fixes I was able to set up a microk8s single-machine cluster that serves binder.

wolfv avatar Oct 16 '20 07:10 wolfv

Thanks for submitting your first pull request! You are awesome! :hugs:
If you haven't done so already, check out Jupyter's Code of Conduct. Also, please make sure you followed the pull request template, as this will help us review your contribution more quickly. welcome You can meet the other Jovyans by joining our Discourse forum. There is also a intro thread there where you can stop by and say Hi! :wave:
Welcome to the Jupyter community! :tada:

welcome[bot] avatar Oct 16 '20 07:10 welcome[bot]

hmm i am just realizing that this might break docker images that are like org/some-docker... maybe I should check that the splitted prefix matches the registry location.

wolfv avatar Oct 16 '20 08:10 wolfv

Ah, on re-reading more carefully, I realize that a local registry can have top-level images (i.e. localhost:3200/image:tag), and our code otherwise assumes that all images have the form repo/image:tag so there's exactly one slash in the repository name itself. Does it work if you do it that way, e.g. imagePrefix: "localhost:3200/binder/r2d-" without any other changes?

I think it would probably be good to actually parse this info rather than our increasingly nested sequence of image splits.

What I don't understand, though, is how docker determines if a tag of the form a/b is actually docker.io/a/b or http[s]://a/b. Does it have special handling of localhost and otherwise check for . in the first part? From the docker spec for "Repository":

Repository A collection of tags grouped under a common prefix (the name component before :). For example, in an image tagged with the name my-app:3.1.4, my-app is the Repository component of the name. A repository name is made up of slash-separated name components, optionally prefixed by a DNS hostname. The hostname must comply with standard DNS rules, but may not contain _ characters. If a hostname is present, it may optionally be followed by a port number in the format :8080. Name components may contain lowercase characters, digits, and separators. A separator is defined as a period, one or two underscores, or one or more dashes. A name component may not start or end with a separator.

I've tested with a local registry running on port 80 so I could exclude port info:

docker run --rm -d -p 127.0.0.1:80:5000 --name registry registry:2

and I was able to push images with 0-many / separators, so our code's assumption that there's exactly 1 isn't always valid:

docker tag registry:2 localhost/registry:2
docker push localhost/registry:2
docker tag registry:2 localhost/a/b/c/d/registry:2
docker push localhost/a/b/c/d/registry:2

Docker's implementation defines a domain prefix as containing . or : with a special handling of localhost only (other port- or tld-less hosts won't work):

# added 127.0.0.1 testhost to /etc/hosts
$ docker tag registry:2 testhost/registry:2
$ docker push testhost/registry:2
The push refers to repository [docker.io/testhost/registry]
...

So we can implement proper image string parsing to always return registry, repository, and tag:

def parse_image(image_name):
    """Parse a docker image string into (registry, repository, tag)"""
    # image_name could be "localhost:5000/foo:tag" or "gcr.io/project/subrepo/sub/sub/sub:tag" or "org/repo:tag" for implicit docker.io
    registry, _, repo_tag = image_name.partition("/")
    if registry == "localhost" or any(c in registry for c in (".", ":")):
        # registry host is specified
        pass
    else:
        # registry not given, use implicit default docker.io registry
        registry = "docker.io"
        repo_tag = image_name
    repository, _, tag = repo_tag.partition(":")
    return (registry, repository, tag)

That eliminates our assumptions about how many / characters are in the repository string. (I used partition here since it always returns the same number and type of items, unlike split which depends on the presence of the separator).

minrk avatar Oct 30 '20 09:10 minrk

Hi! @wolfv @minrk

I was running in the exact same situation.

Does it work if you do it that way, e.g. imagePrefix: "localhost:3200/binder/r2d-" without any other changes?

Yes! Using something like your-registry.io/<some-project-name>/<prefix>- as image_prefix is working fine, as stated in the docs.

The parse_image looks fine! Maybe we could also add some validation to image_prefix to be sure it's not prefixed by http[s]:// that would cause parse_image returning wrong information ? The zero-to-binderhub doc should be updated accordingly.

I would be happy to help :)

adriendelsalle avatar Feb 17 '21 11:02 adriendelsalle