
question about the `keep` keyword

Open nia-potato opened this issue 11 months ago • 8 comments

- name: aquasec/trivy
  verbose: true
  mappings:
  - from: aquasec/trivy
    to: someplace/dockerhub/aquasec
    tags:
    - 0.40.0
    - 0.41.0
    - 'keep: latest 5'
    platform: all
    

Hi @xelalexv, just a quick question about the `keep` keyword. As I understand it, `keep` is not designed for keeping image tags up to date dynamically: it only prunes down to the 5 tags that contain `latest` (as in the example above), out of the tags already defined in the existing `tags:` definition.

If I just want to keep the latest 5 tags of an image (meaning that if the tags were latest, 4, 3, 2, 1, I would get exactly those), `keep` currently does not seem suited for that. Instead, it would go through the whole list of tags and only import the tags that contain `latest`.


- name: aquasec/trivy
  verbose: true
  mappings:
  - from: aquasec/trivy
    to: someplace/dockerhub/aquasec
    tags:
    - 'keep: latest 5'
    platform: all
    

If that is correct, what would be the best way to keep the most recent 5 tags? (The use case is a daily job that makes sure we have the latest 5 tags of different images in the registry.)

nia-potato avatar Feb 26 '24 07:02 nia-potato

The keep: latest n is designed for dynamically selecting the latest n tags available at the time of syncing. Note however that this pruning filter works exclusively on semver compliant tags. Verbatim tags included in your mapping would not be touched (unless there are no semver tags at all, see README), i.e. they are included in addition to what keep: latest n selects.

So if you want to make sure you sync verbatim tag latest plus the 5 most recent versions on each sync run, you can do:

tags:
  - 'latest'
  - 'semver: >=0.0.1'
  - 'keep: latest 5'

Instead of 0.0.1 you can use a later version, depending on your particular image.

xelalexv avatar Feb 26 '24 11:02 xelalexv

Thank you @xelalexv, I've tried it, and it seemed to work for some of the images. However, I think I am still missing something for my current use case:

I keep a large and growing dregsy config, and each public Docker image labels its tags differently, so I cannot manage semver/regex rules for each image individually. What I am trying to do is run a daily job that makes sure the 5 most recent tags of each image are synced to our internal registry, and then run lifecycle rules on top of it.

I tried the above suggestion, but I still hit issues like this:

DEBU[0084] tags expanded from semver: [0.1.0 0.2.0 0.3.0 0.3.2 0.3.3 0.4.0 0.5.0 0.5.0-rc.10 0.5.0-rc.11 0.5.0-rc.12 0.5.0-rc.2 0.5.0-rc.3 0.5.0-rc.4 0.5.0-rc.5 0.5.0-rc.6 0.5.0-rc.7 0.5.0-rc.8 0.5.0-rc.9 0.5.0-snapshot.uncommitted.feat.dynamic.accounts.6ee7591 0.5.0-snapshot.uncommitted.feat.dynamic.accounts.e84d64b 0.6.0 0.6.0-rc.1 0.6.0-rc.2 0.6.0-rc.3 0.6.0-rc.4 0.6.1 0.6.1-rc.1 0.7.0-rc.1 0444319 1.0.0 1.0.0-rc.2 1.0.1 1.0.1-rc.1 1.0.2 1.0.2-rc.1 1.0.3 1.0.3-rc.1 1.0.3-rc.2 1.1.0 1.1.0-rc.1 1.1.0-rc.10 1.1.0-rc.11 1.1.0-rc.12 1.1.0-rc.13 1.1.0-rc.2 1.1.0-rc.3 1.1.0-rc.4 1.1.0-rc.5 1.1.0-rc.6 1.1.0-rc.7 1.1.0-rc.8 1.1.0-rc.9 1.1.0-snapshot.uncommitted.feat.dynamic.accounts.6ee7591 1.1.0-snapshot.uncommitted.feat.dynamic.accounts.e5cee25 1.1.0-snapshot.uncommitted.feat.ingress.support.24663fe 1.1.0-snapshot.uncommitted.feat.ingress.support.85ce4ac 1.1.0-snapshot.uncommitted.feat.ingress.support.bff2074 1.1.0-snapshot.uncommitted.feat.ingress.support.eadd668 1.1.0-snapshot.uncommitted.fix.validation.webhook.96c3251 1.1.1 1.1.1-rc.1 1.1.2 1.1.2-rc.1 1.1.2-rc.1-ubi 1.1.2-rc.2 1.1.2-rc.2-ubi 1.1.2-ubi 1.2.0 1.2.0-rc.1 1.2.0-rc.10 1.2.0-rc.10-ubi 1.2.0-rc.11 1.2.0-rc.11-ubi 1.2.0-rc.12 1.2.0-rc.12-ubi 1.2.0-rc.13 1.2.0-rc.13-ubi 1.2.0-rc.14 1.2.0-rc.14-ubi 1.2.0-rc.15 1.2.0-rc.15-ubi 1.2.0-rc.16 1.2.0-rc.16-ubi 1.2.0-rc.17 1.2.0-rc.17-ubi 1.2.0-rc.18 1.2.0-rc.18-ubi 1.2.0-rc.2 1.2.0-rc.3 1.2.0-rc.4 1.2.0-rc.5 1.2.0-rc.5-ubi 1.2.0-rc.6 1.2.0-rc.6-ubi 1.2.0-rc.7 1.2.0-rc.7-ubi 1.2.0-rc.8 1.2.0-rc.8-ubi 1.2.0-rc.9 1.2.0-rc.9-ubi 1.2.0-ubi 1.2.1 1.2.1-ubi 1.2.2 1.2.2-rc.1 1.2.2-rc.1-ubi 1.2.2-ubi 1.2.3 1.2.3-rc.1 1.2.3-rc.1-ubi 1.2.3-ubi 1.2.4 1.2.4-rc.1 1.2.4-rc.1-ubi 1.2.4-ubi 1.2.5 1.2.5-rc.1 1.2.5-rc.1-ubi 1.2.5-ubi 1.3.0 1.3.0-rc.1 1.3.0-rc.1-ubi 1.3.0-rc.10 1.3.0-rc.10-ubi 1.3.0-rc.11 1.3.0-rc.11-ubi 1.3.0-rc.12 1.3.0-rc.12-ubi 1.3.0-rc.13 1.3.0-rc.13-ubi 1.3.0-rc.14 1.3.0-rc.14-ubi 1.3.0-rc.15 1.3.0-rc.16 1.3.0-rc.17 1.3.0-rc.18 1.3.0-rc.19 1.3.0-rc.2 
1.3.0-rc.2-ubi 1.3.0-rc.20 1.3.0-rc.21 1.3.0-rc.22 1.3.0-rc.23 1.3.0-rc.24 1.3.0-rc.25 1.3.0-rc.26 1.3.0-rc.27 1.3.0-rc.28 1.3.0-rc.29 1.3.0-rc.3 1.3.0-rc.3-ubi 1.3.0-rc.30 1.3.0-rc.31 1.3.0-rc.32 1.3.0-rc.33 1.3.0-rc.34 1.3.0-rc.35 1.3.0-rc.36 1.3.0-rc.37 1.3.0-rc.38 1.3.0-rc.39 1.3.0-rc.4 1.3.0-rc.4-ubi 1.3.0-rc.40 1.3.0-rc.41 1.3.0-rc.42 1.3.0-rc.43 1.3.0-rc.44 1.3.0-rc.45 1.3.0-rc.46 1.3.0-rc.47 1.3.0-rc.48 1.3.0-rc.49 1.3.0-rc.5 1.3.0-rc.5-ubi 1.3.0-rc.50 1.3.0-rc.51 1.3.0-rc.52 1.3.0-rc.53 1.3.0-rc.54 1.3.0-rc.55 1.3.0-rc.56 1.3.0-rc.57 1.3.0-rc.58 1.3.0-rc.6 1.3.0-rc.6-ubi 1.3.0-rc.7 1.3.0-rc.7-ubi 1.3.0-rc.8 1.3.0-rc.8-ubi 1.3.0-rc.9 1.3.0-rc.9-ubi 1.3.0-snapshot.fix.increase.timeout.and.validate.ready.replicas.224c3cc 1.3.1 1.4.0 1.4.0-rc.1 1.4.0-rc.2 1.4.0-rc.3 1.4.0-rc.4 1.5.0-rc.1 1.5.0-rc.2 1.5.0-snapshot.fix.lambda.validation.b511501 8860284] 
DEBU[0084] pruned tags: []                              
DEBU[0084] adding verbatim tags: [latest]               
DEBU[0084] reducing tag set                              limit=5
DEBU[0084] removed tags: [1.4.0 1.4.0-rc.4 1.4.0-rc.3 1.4.0-rc.2 1.4.0-rc.1 1.3.1 1.3.0 1.3.0-snapshot.fix.increase.timeout.and.validate.ready.replicas.224c3cc 1.3.0-rc.9-ubi 1.3.0-rc.8-ubi 1.3.0-rc.7-ubi 1.3.0-rc.6-ubi 1.3.0-rc.5-ubi 1.3.0-rc.4-ubi 1.3.0-rc.3-ubi 1.3.0-rc.2-ubi 1.3.0-rc.14-ubi 1.3.0-rc.13-ubi 1.3.0-rc.12-ubi 1.3.0-rc.11-ubi 1.3.0-rc.10-ubi 1.3.0-rc.1-ubi 1.3.0-rc.58 1.3.0-rc.57 1.3.0-rc.56 1.3.0-rc.55 1.3.0-rc.54 1.3.0-rc.53 1.3.0-rc.52 1.3.0-rc.51 1.3.0-rc.50 1.3.0-rc.49 1.3.0-rc.48 1.3.0-rc.47 1.3.0-rc.46 1.3.0-rc.45 1.3.0-rc.44 1.3.0-rc.43 1.3.0-rc.42 1.3.0-rc.41 1.3.0-rc.40 1.3.0-rc.39 1.3.0-rc.38 1.3.0-rc.37 1.3.0-rc.36 1.3.0-rc.35 1.3.0-rc.34 1.3.0-rc.33 1.3.0-rc.32 1.3.0-rc.31 1.3.0-rc.30 1.3.0-rc.29 1.3.0-rc.28 1.3.0-rc.27 1.3.0-rc.26 1.3.0-rc.25 1.3.0-rc.24 1.3.0-rc.23 1.3.0-rc.22 1.3.0-rc.21 1.3.0-rc.20 1.3.0-rc.19 1.3.0-rc.18 1.3.0-rc.17 1.3.0-rc.16 1.3.0-rc.15 1.3.0-rc.14 1.3.0-rc.13 1.3.0-rc.12 1.3.0-rc.11 1.3.0-rc.10 1.3.0-rc.9 1.3.0-rc.8 1.3.0-rc.7 1.3.0-rc.6 1.3.0-rc.5 1.3.0-rc.4 1.3.0-rc.3 1.3.0-rc.2 1.3.0-rc.1 1.2.5 1.2.5-ubi 1.2.5-rc.1-ubi 1.2.5-rc.1 1.2.4 1.2.4-ubi 1.2.4-rc.1-ubi 1.2.4-rc.1 1.2.3 1.2.3-ubi 1.2.3-rc.1-ubi 1.2.3-rc.1 1.2.2 1.2.2-ubi 1.2.2-rc.1-ubi 1.2.2-rc.1 1.2.1 1.2.1-ubi 1.2.0 1.2.0-ubi 1.2.0-rc.9-ubi 1.2.0-rc.8-ubi 1.2.0-rc.7-ubi 1.2.0-rc.6-ubi 1.2.0-rc.5-ubi 1.2.0-rc.18-ubi 1.2.0-rc.17-ubi 1.2.0-rc.16-ubi 1.2.0-rc.15-ubi 1.2.0-rc.14-ubi 1.2.0-rc.13-ubi 1.2.0-rc.12-ubi 1.2.0-rc.11-ubi 1.2.0-rc.10-ubi 1.2.0-rc.18 1.2.0-rc.17 1.2.0-rc.16 1.2.0-rc.15 1.2.0-rc.14 1.2.0-rc.13 1.2.0-rc.12 1.2.0-rc.11 1.2.0-rc.10 1.2.0-rc.9 1.2.0-rc.8 1.2.0-rc.7 1.2.0-rc.6 1.2.0-rc.5 1.2.0-rc.4 1.2.0-rc.3 1.2.0-rc.2 1.2.0-rc.1 1.1.2 1.1.2-ubi 1.1.2-rc.2-ubi 1.1.2-rc.1-ubi 1.1.2-rc.2 1.1.2-rc.1 1.1.1 1.1.1-rc.1 1.1.0 1.1.0-snapshot.uncommitted.fix.validation.webhook.96c3251 1.1.0-snapshot.uncommitted.feat.ingress.support.eadd668 1.1.0-snapshot.uncommitted.feat.ingress.support.bff2074 
1.1.0-snapshot.uncommitted.feat.ingress.support.85ce4ac 1.1.0-snapshot.uncommitted.feat.ingress.support.24663fe 1.1.0-snapshot.uncommitted.feat.dynamic.accounts.e5cee25 1.1.0-snapshot.uncommitted.feat.dynamic.accounts.6ee7591 1.1.0-rc.13 1.1.0-rc.12 1.1.0-rc.11 1.1.0-rc.10 1.1.0-rc.9 1.1.0-rc.8 1.1.0-rc.7 1.1.0-rc.6 1.1.0-rc.5 1.1.0-rc.4 1.1.0-rc.3 1.1.0-rc.2 1.1.0-rc.1 1.0.3 1.0.3-rc.2 1.0.3-rc.1 1.0.2 1.0.2-rc.1 1.0.1 1.0.1-rc.1 1.0.0 1.0.0-rc.2 0.7.0-rc.1 0.6.1 0.6.1-rc.1 0.6.0 0.6.0-rc.4 0.6.0-rc.3 0.6.0-rc.2 0.6.0-rc.1 0.5.0 0.5.0-snapshot.uncommitted.feat.dynamic.accounts.e84d64b 0.5.0-snapshot.uncommitted.feat.dynamic.accounts.6ee7591 0.5.0-rc.12 0.5.0-rc.11 0.5.0-rc.10 0.5.0-rc.9 0.5.0-rc.8 0.5.0-rc.7 0.5.0-rc.6 0.5.0-rc.5 0.5.0-rc.4 0.5.0-rc.3 0.5.0-rc.2 0.4.0 0.3.3 0.3.2 0.3.0 0.2.0 0.1.0] 
DEBU[0084] expanded tags: [0444319 1.5.0-rc.1 1.5.0-rc.2 1.5.0-snapshot.fix.lambda.validation.b511501 8860284 latest]

where the config would be something like:

- name: armory/spinnaker-operator
  verbose: true
  mappings:
  - from: armory/spinnaker-operator
    to: someplace/dockerhub/armory/spinnaker-operator
    tags:
    - latest
    - 'semver: >=0.0.1'
    - 'keep: latest 5'
    platform: all

You can see that with the current tag definitions I am getting these tags: `0444319 1.5.0-rc.1 1.5.0-rc.2 1.5.0-snapshot.fix.lambda.validation.b511501 8860284 latest`, where what I expected would be the tags sorted by source image update time: `dev 1.5.0-rc.2 1.5.0-rc.1 1.3.1 1.4.0`.

So to sum this up, the question is: can we set a filter based on the 5 most recent images by source upload time? Or can I do some magic via a universal addition to the tags of all my remaining images (regex/semver/keep)?

nia-potato avatar Feb 26 '24 12:02 nia-potato

> where what i expected would be sorted by source image updated time

latest n does not work on upload time stamps, but on the actual semver versions. That is, to get the latest n tags, all semver compliant tags are sorted in descending version order (e.g. 1.2.3 is newer than 0.8.9) and the top n are picked. At what time they got created is irrelevant. So the result in your log seems about right. Only the inclusion of 0444319 and 8860284 is somewhat annoying. Apparently, the used semver lib considers them to be proper version numbers (e.g. 0444319.0.0). You could get rid of those by including an inverted regex rule: regex: ![0-9]+, which filters out plain numbers.
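To make the ordering above concrete, here is a minimal Python sketch. It is not dregsy's code and not the semver library it uses: `parse`, `compare`, and `keep_latest` are hypothetical helpers, and the lenient parsing (a missing minor or patch is filled with 0) is an assumption made to mimic how plain numbers such as 0444319 end up being treated as versions.

```python
import re
from functools import cmp_to_key

# Assumption: lenient parsing, optional "v", optional minor/patch, optional pre-release.
_VERSION = re.compile(
    r"^v?([0-9]+)(?:\.([0-9]+))?(?:\.([0-9]+))?(?:-([0-9A-Za-z.-]+))?$"
)


def parse(tag):
    """Return ((major, minor, patch), prerelease_parts) or None for non-versions."""
    m = _VERSION.match(tag)
    if m is None:
        return None
    major, minor, patch, pre = m.groups()
    core = (int(major), int(minor or 0), int(patch or 0))
    return core, (pre.split(".") if pre else [])


def _cmp_ident(a, b):
    # Semver precedence: numeric identifiers compare as numbers and rank
    # below alphanumeric ones, which compare as plain strings.
    a_num, b_num = a.isdigit(), b.isdigit()
    if a_num and b_num:
        return (int(a) > int(b)) - (int(a) < int(b))
    if a_num != b_num:
        return -1 if a_num else 1
    return (a > b) - (a < b)


def compare(x, y):
    (core_x, pre_x), (core_y, pre_y) = x, y
    if core_x != core_y:
        return -1 if core_x < core_y else 1
    if not pre_x or not pre_y:
        # A release outranks any pre-release of the same core version.
        return bool(pre_y) - bool(pre_x)
    for a, b in zip(pre_x, pre_y):
        c = _cmp_ident(a, b)
        if c:
            return c
    return (len(pre_x) > len(pre_y)) - (len(pre_x) < len(pre_y))


def keep_latest(tags, n):
    """Sort version-like tags in descending version order and keep the top n."""
    versions = [(parse(t), t) for t in tags if parse(t) is not None]
    versions.sort(key=cmp_to_key(lambda p, q: compare(p[0], q[0])), reverse=True)
    return [t for _, t in versions[:n]]


# A subset of the tags from the debug log above.
tags = [
    "1.3.1", "1.4.0", "1.4.0-rc.4", "1.5.0-rc.1", "1.5.0-rc.2",
    "1.5.0-snapshot.fix.lambda.validation.b511501",
    "0444319", "8860284", "latest",
]
print(keep_latest(tags, 5))
```

On this subset the sketch picks the same five version tags that show up in the `expanded tags` line of the log (`latest` is not a version, so it is only ever added as a verbatim tag). Push time plays no role anywhere in the comparison.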

xelalexv avatar Feb 26 '24 14:02 xelalexv

Gotcha, thank you @xelalexv for the explanation. However, after testing this, it seems like I need to keep updating my inverted regex and adapt the rule to different types of images. Is there any way, from your perspective, to just keep the latest 5 tags universally for all images in a config file? Or is my only resort to keep updating the inverted regex based on the tag formats of the different source images?

nia-potato avatar Mar 04 '24 03:03 nia-potato

> it seems like i need to keep updating my inverted regex and need to adapt the rule to different type of images

Could you give a couple of examples? I'm not sure I understand this.

> is there any way from your perspective to just keep the latest 5 tags universally for all images in a config file?

Latest n by timestamp is not supported, only the semver mechanism. I am hesitant to implement something like that because the usefulness of tag timestamps is limited: A tag for an older version of an image can always be created in a repo, even if tags for newer versions already exist. Its timestamp would then be newer than the timestamps of these other tags, so sorting tags by timestamp can be misleading.
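A small Python sketch of why timestamp ordering can mislead. The tags and push dates below are made up purely for illustration and do not come from any real repository:

```python
from datetime import date

# Hypothetical push dates: a patch for an older release line is pushed last.
pushed = {
    "2.0.0": date(2024, 1, 10),
    "2.1.0": date(2024, 2, 1),
    "1.9.5": date(2024, 2, 20),
}


def core(tag):
    """Turn 'major.minor.patch' into a tuple of ints for version ordering."""
    return tuple(int(part) for part in tag.split("."))


by_time = sorted(pushed, key=pushed.get, reverse=True)
by_version = sorted(pushed, key=core, reverse=True)

print(by_time)     # ['1.9.5', '2.1.0', '2.0.0']
print(by_version)  # ['2.1.0', '2.0.0', '1.9.5']
```

Sorted by timestamp, the backported 1.9.5 looks like the newest tag, even though 2.1.0 is the highest version.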

xelalexv avatar Mar 04 '24 06:03 xelalexv

> Could you give a couple of examples? I'm not sure I understand this.

So let's say I use dregsy to bulk-manage 500 public images in one config.yaml, and every night I run a Kubernetes cron job to make sure the current image repository contains the newest images.

As you have said in a similar way, there is no standard for what an image repo should use as tags. So to get the latest N images, I need to do an initial fetch of the tags and apply some sort of deny-all rule for everything that does not look like a major/minor release tag, in descending order (or some other logic to determine what the latest major/minor release tags are).

If we take Spinnaker as an example again:

- name: armory/spinnaker-operator
  verbose: true
  mappings:
  - from: armory/spinnaker-operator
    to: someplace/dockerhub/armory/spinnaker-operator
    tags:
    - latest
    - 'semver: >=0.0.1'
    - 'regex: ![0-9]+'
    - 'keep: latest 5'
    platform: all

With that approach you do get the image tagged `latest`, but you also receive other tags that merely contain the keyword `latest`: garbage tags from the past that the repo maintainer never cleaned up.

If we take the trivy image as an example, it has lots of tags ending in `.sig`, so I would have to include an inverted regex rule to disregard all tags that end with `.sig`.

As you can see, unless I implement some sort of custom inference logic for which tags are actually valid newer tags (valid as in major/minor release tags), keeping up to date with image tags for a big registry is quite difficult in the current situation (unless I'm misunderstanding something).

> I am hesitant to implement something like that because the usefulness of tag timestamps is limited

I thought about this, and I completely agree with you. If the answer is just that I need to come up with a deny-all rule for all the "garbage tags", I totally understand. But even if the `keep` parameter only supported something like `keep: N highest descending tags`, combined with a regex that excludes the high outlier numbers, I would still be able to get (for Spinnaker, for example) 1.5.0-rc2, 1.5.0-rc1, 1.4.1, 1.4.0, etc. (if I don't add a regex that excludes the rc tags).

nia-potato avatar Mar 04 '24 07:03 nia-potato

OK, I think I understand your use case. What we need here is something like strict semver. How about doing this:

tags:
  - latest
  - 'semver: >=0.0.1'
  - 'keep: v*[0-9]+\.[0-9]+\.[0-9]+'
  - 'keep: latest 5'

The regexp `keep` will reduce the semver set to include only pure semvers, i.e. those merely consisting of major.minor.patch, while allowing the v prefix. Any semver with a suffix will be dropped. In addition, the verbatim tag `latest` will also be included (remove this if not needed). However, note that this is an exact match, i.e. it only matches tag `latest`, not `latest-foo`. I hope this can solve the issue.

xelalexv avatar Mar 04 '24 08:03 xelalexv

I think you could even reduce this to:

tags:
  - latest
  - 'regex: v*[0-9]+\.[0-9]+\.[0-9]+'
  - 'keep: latest 5'
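
As a rough illustration of what this reduced rule set is meant to select, here is a Python sketch. It is not dregsy's implementation: `strict_latest` is a hypothetical helper, and it assumes the regex has to match the whole tag, which is what "any semver with a suffix will be dropped" suggests.

```python
import re

# Same pattern as in the rule above; assumed to be matched against the full tag.
STRICT = re.compile(r"v*[0-9]+\.[0-9]+\.[0-9]+")


def strict_latest(tags, n, verbatim=("latest",)):
    """Keep the n highest pure major.minor.patch tags, then add verbatim tags."""
    pure = [t for t in tags if STRICT.fullmatch(t)]
    pure.sort(
        key=lambda t: tuple(int(part) for part in t.lstrip("v").split(".")),
        reverse=True,
    )
    return pure[:n] + list(verbatim)


# A subset of the armory/spinnaker-operator tags from the log earlier in the thread.
tags = [
    "0444319", "8860284", "1.5.0-rc.2", "1.5.0-rc.1", "1.4.0", "1.4.0-rc.4",
    "1.3.1", "1.3.0", "1.2.5", "1.2.5-ubi", "1.2.4", "1.2.3",
]
print(strict_latest(tags, 5))
# ['1.4.0', '1.3.1', '1.3.0', '1.2.5', '1.2.4', 'latest']
```

Plain numbers, rc builds, and `-ubi` variants all fail the full match, so they never reach the "latest 5" step.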

xelalexv avatar Mar 04 '24 08:03 xelalexv

that worked, thanks!

nia-potato avatar Mar 21 '24 23:03 nia-potato