delete-package-versions
[help appreciated] Support multi arch packages / Solving "Manifest Unknown" error
Fixes https://github.com/actions/delete-package-versions/issues/90
Hi @cwille97, @takost, apologies for tagging you in this. I saw you have recently contributed to this project, and I was hoping that we could solve the issue above together.
The problem we want to solve is that the current "delete-package-versions" action does not work for multi-architecture packages: it deletes the individual per-architecture packages and keeps only the "parent" shell that points to them.
Example:
https://github.com/ManiMatter/decluttarr/pkgs/container/decluttarr/204442395?tag=v1.38.0
Running the action would keep the main package v1.38.0 (sha256:b4a9b04d8c0a5ab9f400f7f64f8be20d9951a996fd00882a936087af8f5ce43d)
but it would lose the 3 Sub-IDs:
linux/amd64: sha256:c2dfb515fd9a6ad396fe6a48cd3e535b4079b467cb691bcb3faede6889089d6e
linux/arm64: sha256:59b2aa2e04cc6b3391f612833e87bbd0c4fdfddb04845b8e8f0365a45e90151c
unknown/unknown: sha256:6dfc07ab69cbe95303f51fed14b40a9574bbebbb3501d7aec481d184a8321c91
On pulling the main package, the user would then get a "manifest unknown" error.
I have an idea how it could be fixed, and conceptually it does not seem crazy difficult. Unfortunately, GitHub Actions and TypeScript are not my forte, and what I have prepared so far is only pseudocode.
My idea would be that we enhance the code in two places:
- When we fetch the package information, we also fetch a list of subIDs contained in the package (e.g. for a parent package it would have the package IDs of the underlying architectures; for the architecture packages that subID would be empty)
- We change the function finalIds as proposed in my code, which would "shield" those packages from deletion that are not tagged but are part of a parent package that is tagged
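To make the second point concrete, here is a minimal sketch of the "shielding" step. It is not the action's actual code: the `PackageVersion` shape, the `subIds` field, and the function name `deletableUntaggedIds` are assumptions made for illustration, and `subIds` is assumed to have been filled in by the fetch step from the first point.

```typescript
// Hypothetical shape. `subIds` is an assumption: it would have to be
// populated by the fetch step described in the first bullet.
interface PackageVersion {
  id: number;
  tagged: boolean;
  subIds: number[]; // per-architecture version IDs; empty for non-parents
}

// Return the untagged version IDs that are safe to delete: those that no
// tagged parent refers to.
function deletableUntaggedIds(versions: PackageVersion[]): number[] {
  const shielded = new Set<number>();
  for (const v of versions) {
    if (v.tagged) {
      for (const id of v.subIds) shielded.add(id);
    }
  }
  return versions
    .filter(v => !v.tagged && !shielded.has(v.id))
    .map(v => v.id);
}
```

With a tagged parent that lists one sub-ID plus an unrelated untagged version, only the unrelated version would be returned for deletion.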
It would be fantastic if you chimed in with your skills :)
Let me know what you think
TypeScript is also not my forte, but is your pseudocode assuming that it only removes multi-arch packages that are untagged? If somebody runs it to keep only the 5 most recent versions, it may need to remove tagged multi-arch packages, in which case we'd want to remove their untagged parts too.
What if, as a different approach, it just refuses to delete a version if it's in use by a multi-arch image? So a user could still do two rounds:
jobs:
  delete-untagged-old-packages:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/delete-package-versions@v4
        with:
          delete-only-untagged-versions: 'true'
          min-versions-to-keep: 25
          package-name: ${{ github.event.repository.name }}
          package-type: container
  delete-tagged-old-packages:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/delete-package-versions@v4
        with:
          min-versions-to-keep: 25
          package-name: ${{ github.event.repository.name }}
          package-type: container
The second round makes sure there are never more than 25 tagged versions in the registry. The first round basically keeps only the latest 25 untagged versions, but if it tries to delete an untagged version (say version 30) that is part of a multi-arch image still lying around, it will not delete version 30. This means that after the first job runs there could be more than 25 untagged versions left over, but every one beyond those 25 is still in use.
Essentially, this would mean querying every image for its manifest at the beginning and, if that manifest contains child manifests, saving their digests to an array. Then, when deleting untagged images, it would simply skip any image whose digest is in that array and move on.
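A rough sketch of that idea follows. It assumes the manifests have already been fetched and parsed; how they are fetched is left out. The `manifests[].digest` layout follows the OCI image index / Docker manifest list format, while the type and function names (`Manifest`, `referencedDigests`, `skipInUse`) are made up for this example.

```typescript
// Minimal subset of an OCI image index / Docker manifest list.
interface ManifestDescriptor {
  digest: string;
}
interface Manifest {
  mediaType?: string;
  manifests?: ManifestDescriptor[]; // present only on multi-arch indexes
}

// Collect every digest that some multi-arch index still points to.
function referencedDigests(manifests: Manifest[]): Set<string> {
  const inUse = new Set<string>();
  for (const m of manifests) {
    for (const child of m.manifests ?? []) {
      inUse.add(child.digest);
    }
  }
  return inUse;
}

// Drop deletion candidates whose digest is still referenced.
function skipInUse(candidates: string[], inUse: Set<string>): string[] {
  return candidates.filter(digest => !inUse.has(digest));
}
```

A `Set` is used instead of an array so that each lookup stays cheap when a package has many versions.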
Thank you for the thoughts, @emmahsax
I have adjusted the code to support "minVersionsToKeep" and "numOldVersionsToDelete".
My approach here was:
- Ignore subpackages when determining which packages to delete / retain, and only look at parent packages (a parent package being any package that is not a subpackage, so the setup also works when no multi-arch setup is used; there, everything is a parent package)
- After having determined which parent packages to eliminate, check whether any subpackages belong to parent packages flagged for deletion, and delete these subpackages together with their parent packages
Still, it's pseudocode, and the piece I have no clue how to code is the "subIds" list (essentially the manifest IDs) in get-versions.ts. I also have no experience with how to run it so that it could be tested.
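The two steps above could look roughly like the sketch below. The `Version` shape, the `subIds` field, and the function name `idsToDelete` are assumptions for illustration only. The sketch also adds one safeguard that the description above does not mention: a subpackage shared with a parent that is being kept is not deleted.

```typescript
// Hypothetical shape; `subIds` is assumed to be available already.
interface Version {
  id: number;
  createdAt: string; // ISO 8601 timestamp
  subIds: number[];
}

function idsToDelete(versions: Version[], minVersionsToKeep: number): number[] {
  // Step 1: only parents (versions nobody references) count toward retention.
  const allSubIds = new Set<number>();
  for (const v of versions) {
    for (const id of v.subIds) allSubIds.add(id);
  }
  const parents = versions
    .filter(v => !allSubIds.has(v.id))
    .sort((a, b) => Date.parse(b.createdAt) - Date.parse(a.createdAt)); // newest first

  const kept = parents.slice(0, minVersionsToKeep);
  const doomed = parents.slice(minVersionsToKeep);

  // Added safeguard: subpackages of kept parents must survive.
  const stillNeeded = new Set<number>();
  for (const p of kept) {
    for (const id of p.subIds) stillNeeded.add(id);
  }

  // Step 2: delete each doomed parent together with its subpackages.
  const result: number[] = [];
  const seen = new Set<number>();
  for (const p of doomed) {
    for (const id of [p.id].concat(p.subIds)) {
      if (!stillNeeded.has(id) && !seen.has(id)) {
        seen.add(id);
        result.push(id);
      }
    }
  }
  return result;
}
```

Handling "numOldVersionsToDelete" would only change how `doomed` is sliced from `parents`; the cascade to subpackages stays the same.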
Update: I was able to query the github API to get the package versions. My problem is that the response does not include the sub-package version IDs.
Example: Fetching https://api.github.com/users/manimatter/packages/container/decluttarr/versions/204488276 (per the API definition /users/{username}/packages/{package_type}/{package_name}/versions/{package_version_id}) returns:
{
  id: 204488276,
  name: 'sha256:44a623848836eb4629e22849eab526fffe4eb823d43a94aa1aef70d3dbed3c2a',
  url: 'https://api.github.com/users/ManiMatter/packages/container/decluttarr/versions/204488276',
  package_html_url: 'https://github.com/users/ManiMatter/packages/container/package/decluttarr',
  created_at: '2024-04-16T16:32:06Z',
  updated_at: '2024-04-16T16:32:06Z',
  html_url: 'https://github.com/users/ManiMatter/packages/container/decluttarr/204488276',
  metadata: { package_type: 'container', container: { tags: [Array] } }
}
What I am missing here is the link to the other package ids (sub-package ids), which is visible when I navigate to the respective page via browser. https://github.com/ManiMatter/decluttarr/pkgs/container/decluttarr/204488276
Any idea how to get the sub-ids of 204488276 returned by an API call? For linux/amd64, I'd expect to see:
id: 204488239,
name: 'sha256:aa737eb74277e312c218a6ee2a53659b9b5d736287d5b285623ebbc30652e2ff',
Any help appreciated - started a discussion here in the meantime
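One possible workaround, sketched under assumptions: the response above shows that a container version's `name` is its digest, so if the child digests of a parent can be obtained from somewhere else (for example from the image index served by the container registry, which is not part of this REST response), they could be matched against the `name` of every listed version to recover the sub-IDs. The names `VersionSummary` and `resolveSubIds` are hypothetical.

```typescript
// Subset of the fields shown in the response above.
interface VersionSummary {
  id: number;
  name: string; // for container packages this is the manifest digest
}

// Map child digests (assumed to come from the parent's image index)
// to package version IDs by matching them against `name`.
function resolveSubIds(
  childDigests: string[],
  versions: VersionSummary[]
): number[] {
  const idByDigest = new Map<string, number>();
  for (const v of versions) idByDigest.set(v.name, v.id);

  const ids: number[] = [];
  for (const digest of childDigests) {
    const id = idByDigest.get(digest);
    if (id !== undefined) ids.push(id); // digests without a listed version are ignored
  }
  return ids;
}
```

This only needs the version list that the action already fetches, plus one manifest lookup per parent.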
Stopping development on this, since it is covered by https://github.com/dataaxiom/ghcr-cleanup-action
I have also been working on https://github.com/emmahsax/action-ghcr-prune. It can be called like this:
jobs:
  delete-old-packages:
    runs-on: ubuntu-latest
    steps:
      # Delete all tagged packages except `latest` and the last other 5 (so 6 tagged packages left over
      # at the end). Also delete the untagged manifests of the tagged multi-arch packages that are deleted.
      - name: Delete Old Tagged Packages
        uses: emmahsax/action-ghcr-prune@main
        with:
          container: ${{ github.event.repository.name }}
          dry-run: true
          keep-last: 5
          keep-tags-regexes: |
            ^latest$
          organization: ${{ github.repository_owner }}
          prune-tags-regexes: |
            .+
          remove-multi-platform: true
          token: ${{ secrets.GITHUB_TOKEN }}
      # After the first step finishes, then go through and delete all other untagged packages
      # UNLESS they are a part of a multi-arch image that is left around from the first step.
      - name: Delete Old Untagged Packages
        uses: emmahsax/action-ghcr-prune@main
        with:
          container: ${{ github.event.repository.name }}
          dry-run: true
          keep-last: 0
          organization: ${{ github.repository_owner }}
          prune-untagged: true
          token: ${{ secrets.GITHUB_TOKEN }}
I would love though for this official action to eventually pick up multi-arch support :(