Support sync of whole repository
Great project! One idea I have for an enhancement is to add the ability to sync a whole repository.
For example:
I have myuser at Docker Hub with some containers and myotheruser at Quay, and I want to sync all containers from Docker Hub to Quay in case one of the platforms is down.
Just an idea which would be cool and possibly useful for companies that maintain multiple registries with the same containers. :)
Thanks for opening this issue @Dysproz! Overall, I think this makes sense and goes well with #7. I'll start getting this done for you.
@Dysproz would the intent be for this functionality to be used mostly as a one-time command, or to be stored in the manifest file? I think it could make sense to support both, but I would rather directly support your use case first, i.e. would you expect:
sinker push docker.io/myrepo:* quay.io/myrepo
or
target:
  host: quay.io
sources:
  - repository: myrepo
    host: docker.io
    tag: *
I think the manifest would be the better option, as a manifest file may be easier for DevOps to read and understand than going through pipeline commands. :)
Sounds good. Lastly, for your use case, do you have a preference or any thoughts on how the list command should print out images with *?
Using * means the list couldn't be piped directly into tooling, because * is an invalid character, but listing all tags could be really noisy and could change based on what was pushed to the remote.
My gut says to ignore the * functionality for some commands.
Yeah, I agree that it can be ignored for commands where it makes no sense, e.g. listing the tags of all images in a repository. :)
@Dysproz as an update, the code base is now in a very good spot to support this. I'm still working out the best way to expose this for consumers.
Having some images displayed in the list command and not others could be confusing. But printing out all tags would also be noisy, and require pulling back all tags every time.
I'm debating whether or not this functionality should be considered separate (potentially another section in the manifest, e.g. repositories), as I do feel it is a smaller use case.
The more I think about this, the more I lean towards a new root level type.
sources should probably be renamed to images
Add a repositories root type which will sync entire repos.
Then the command line interaction could be:
sinker list images --source
sinker list repositories --source
sinker pull repositories
sinker pull images --source
etc.
Since I foresee images being the primary use case, and to keep the workflow consistent for current users, I may make it so that if images/repositories is not specified on the command line, it defaults to images.
sinker pull --target
This cleanly separates the two different concerns, without a bunch of branching logic based on asterisks in the tags field, no tag at all, etc.
target:
  host: myhost.com
  repository: my/repo
images:
  - repository: coreos/prometheus-operator
    host: quay.io
    tag: v0.39.0
repositories:
  - path: my/repo
    host: quay.io
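To illustrate the defaulting rule described above (no images/repositories on the command line means images), here is a minimal sketch in Python. It is not sinker's actual code; `resolve_resource` and `RESOURCE_TYPES` are hypothetical names made up for this example.

```python
# Hypothetical helper, not part of sinker: decides which root type a
# command applies to, falling back to "images" when none is given.
RESOURCE_TYPES = ("images", "repositories")


def resolve_resource(args):
    """Split the arguments after the verb into (resource_type, remaining_args).

    If the first argument is not a known resource type, default to
    "images" so that existing invocations keep working unchanged.
    """
    if args and args[0] in RESOURCE_TYPES:
        return args[0], list(args[1:])
    return "images", list(args)


# "sinker pull repositories" targets the repositories section.
print(resolve_resource(["repositories"]))
# "sinker pull --target" has no resource type, so it targets images.
print(resolve_resource(["--target"]))
```

With this shape, the asterisk never has to be interpreted: which section of the manifest is processed depends only on the resource type.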
I currently use lstags for syncing. It supports all kinds of fancy regexes for defining versions (but of course they come with their own caveats).
Maybe supporting a list of tags for syncing images would come in handy as well (I do not know how to handle digests, though):
target:
  host: myhost.com
  repository: my/repo
images:
  - repository: library/debian
    host: registry.hub.docker.com
    tag:
      - stretch
      - stretch-slim
      - buster
      - buster-slim
      - testing
      - sid
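As a rough sketch of what the tag list above could mean, the snippet below expands one manifest entry into one image reference per tag. It assumes the entry has already been parsed into a dict; `expand_image` is a hypothetical name, and digests are deliberately left out since it is unclear how they should be handled.

```python
def expand_image(entry):
    """Return one 'host/repository:tag' reference per tag in a manifest entry.

    `tag` may be a single value (the existing format) or a list of
    values (the list format proposed above).
    """
    tags = entry["tag"]
    if not isinstance(tags, list):
        tags = [tags]
    return [f'{entry["host"]}/{entry["repository"]}:{tag}' for tag in tags]


entry = {
    "repository": "library/debian",
    "host": "registry.hub.docker.com",
    "tag": ["stretch", "stretch-slim"],
}
for reference in expand_image(entry):
    print(reference)
```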
It becomes problematic deciding when a tag should be considered immutable, and when a tag should not.
If you're syncing image:v1.0.0, and that image already exists at the target, the current behavior is to not sync that image at all.
If you're syncing image:latest, the current behavior is to always sync the image (though improvements could be made to compare digests).
In the above example, I would imagine that the expected behavior for stretch, stretch-slim, etc. would be similar to latest: always sync. Keeping this sort of list wouldn't be ideal, though.
A potential solution could be to assume that any tag without a number takes on the always sync behavior (e.g. like latest). I do think it makes sense because users who typically desire immutable tags will tag with numbers, but it feels like a weird way to go about it.
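For illustration only, the "no number means always sync" rule could look like the sketch below. `looks_mutable` is a hypothetical name, and this is not how sinker behaves today.

```python
def looks_mutable(tag):
    """Heuristic: a tag that contains no digit is treated like `latest`."""
    return not any(ch.isdigit() for ch in tag)


for tag in ["latest", "stretch-slim", "v1.0.0"]:
    behavior = "always sync" if looks_mutable(tag) else "sync once"
    print(f"{tag}: {behavior}")
```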
The other way would be to introduce a new section/property, something like overwrite that gives users more control.
I would go for the override keyword. Look at database images, e.g. postgresql: 11 and 11.6 are moving tags, while 11.6.3 is not, AFAIK.
Good point. Override it is!
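A minimal sketch of how such a flag could feed into the sync decision, combining it with the current behavior described earlier (skip a tag that already exists at the target, always sync latest). The function name, the `override` parameter, and its default are assumptions; nothing here is implemented in sinker.

```python
def should_sync(tag, exists_at_target, override=False):
    """Decide whether an image tag should be pushed to the target.

    - `latest` is always synced (current behavior).
    - Any tag flagged with `override` is always synced (proposed).
    - Everything else is synced only if it is missing at the target.
    """
    if tag == "latest" or override:
        return True
    return not exists_at_target


# 11.6 is a moving tag, so the user marks it explicitly.
print(should_sync("11.6", exists_at_target=True, override=True))
# 11.6.3 is left alone once it exists at the target.
print(should_sync("11.6.3", exists_at_target=True))
```

Making the choice explicit avoids guessing mutability from the shape of the tag.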
Hi,
any update on this feature request? Is there any way to sync a whole repo?
@dyipon at the moment there is no way to do this without defining the images and tags up front. I haven't been able to get to this, but pull requests are always welcome!