aptly
New command to purge old versions
Here's a script that claims to call aptly repo remove once for each package with older versions that need removing, for a particular architecture. It relies on gnu sort's -V option, which sorts first by package name then by package version. It's really ugly, but it illustrates that "purge old versions" is nontrivial and might be worth adding as a feature in aptly itself.
#!/bin/sh
set -x
set -e
repo=_my_repo_
arch=amd64
dup=false
p_old=
pkg_old=

# Remove every version of $pkg_old older than the one recorded in $p_old.
# Version is 2nd field in output of aptly repo search, separated by _
remove_older() {
    v_old=`echo $p_old | cut -d_ -f2`
    aptly repo remove $repo "$pkg_old (<< $v_old), Architecture ($arch)"
}

for p in `aptly repo search $repo "Architecture ($arch)" | sed "s/_$arch//" | sort -V`
do
    pkg=`echo $p | sed 's,_.*,,'`
    if test "$pkg" = "$pkg_old"
    then
        dup=true
    elif $dup
    then
        dup=false
        # $p_old is latest version of some package with more than one version
        remove_older
    fi
    p_old="$p"
    pkg_old="$pkg"
done

# The loop only purges a package once the next package name shows up,
# so the last package in sort order has to be handled here.
if $dup
then
    remove_older
fi
For the automatic package publishing system I'm setting up (for a local Debian repository), this feature would be very useful, especially over long periods of time, as the CI/build server will churn out many versions of the packages.
:+1: I would love something like this.
This would be great to have implemented into aptly.
On top of that, this feature would be more useful if it allowed the user to specify the number of old packages to keep.
Selecting how much history to keep is a toughie. Three possibilities come to mind: max # of versions, max age, and max total bytes for all versions of a package. That might handle a lot of use cases, especially if they could be combined.
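A rough sketch of how the three limits could be combined, in case it helps the discussion. Everything here is hypothetical: `PackageVersion`, `select_for_removal` and the parameter names are made up for illustration and are not part of aptly. The sketch assumes the versions of one package are already sorted newest first, and it always keeps the newest version so a package never disappears completely.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class PackageVersion:
    # Hypothetical record describing one version of one package.
    version: str
    age_days: float
    size_bytes: int


def select_for_removal(
    versions: List[PackageVersion],
    max_versions: Optional[int] = None,
    max_age_days: Optional[float] = None,
    max_total_bytes: Optional[int] = None,
) -> List[PackageVersion]:
    """Return the versions to remove; `versions` must be sorted newest first.

    Retention stops at the first version that breaks any configured limit,
    and that version plus everything older is returned for removal.
    """
    total = 0
    for index, pv in enumerate(versions):
        total += pv.size_bytes
        if index == 0:
            continue  # assumption: the newest version is always kept
        too_many = max_versions is not None and index >= max_versions
        too_old = max_age_days is not None and pv.age_days > max_age_days
        too_big = max_total_bytes is not None and total > max_total_bytes
        if too_many or too_old or too_big:
            return versions[index:]
    return []


history = [
    PackageVersion("1.3", age_days=1, size_bytes=100),
    PackageVersion("1.2", age_days=10, size_bytes=100),
    PackageVersion("1.1", age_days=40, size_bytes=100),
    PackageVersion("1.0", age_days=90, size_bytes=100),
]
print([pv.version for pv in select_for_removal(history, max_versions=2)])
print([pv.version for pv in select_for_removal(history, max_total_bytes=350)])
```

Cutting at the first violation (rather than testing each version independently) avoids leaving gaps in the retained history, which seems closer to what a rollback use case wants.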
For my particular use cases, either max # of versions or max age would work.
I've come up with something like this:
# Removes old packages in the received repo
#
# $1: Repository
# $2: Architecture
# $3: Amount of packages to keep
repo-remove-old-packages() {
    local repo=$1
    local arch=$2
    local keep=$3

    for pkg in $(aptly repo search $repo "Architecture ($arch)" | grep -v "ERROR: no results" | sort -rV); do
        local pkg_name=$(echo $pkg | cut -d_ -f1)
        if [ "$pkg_name" != "$cur_pkg" ]; then
            local count=0
            local deleted=""
            local cur_pkg="$pkg_name"
        fi
        test -n "$deleted" && continue
        let count+=1
        if [ $count -gt $keep ]; then
            pkg_version=$(echo $pkg | cut -d_ -f2)
            aptly repo remove $repo "Name ($pkg_name), Version (<= $pkg_version)"
            deleted='yes'
        fi
    done
}
Note that the grep -v "ERROR: no results" is due to #334.
The issue with error messages going to stdout has already been fixed in master.
It would be nice to be able to keep a fixed number of versions. Let's say I want the last 10 versions only, so I can roll back to any of them but do not need to keep all of them.
Something like this (I know it is ugly; it is an ad-hoc one-liner):
version=`aptly repo remove -dry-run=true $repo $package | sort --version-sort | grep $package | tail -n $number_to_leave | head -1 | awk -F"_" '{print $2}'`
aptly repo remove $repo "$package ( << $version)"
UPD: I just noticed a mistake in the version filter.
All other repository managers automatically expire old versions on upload of a new version - e.g. if I upload foo_1.0-2 then foo_1.0-1 is removed. aptly should at least optionally behave like this.
Hello, elaborating on @stumyp's bash combo, I created a Python script which performs (IMHO) the exact behaviour we'd like (I found some issues with the bash version):
#!/usr/bin/env python2.7
import sys
from subprocess import check_output
from apt_pkg import version_compare, init_system

init_system()

repo = sys.argv[1]
package_name = sys.argv[2]
retain_how_many = int(sys.argv[3])

output = check_output(["aptly", "repo", "remove", "-dry-run=true", repo, package_name])
output = [line for line in output.split("\n") if line.startswith("[-]")]
output = [line.replace("[-] ","") for line in output]
output = [line.replace(" removed","") for line in output]

def sort_cmp(name1, name2):
    version_and_build_1 = name1.split("_")[1]
    version_and_build_2 = name2.split("_")[1]
    return version_compare(version_and_build_1, version_and_build_2)

output.sort(cmp=sort_cmp)
should_delete = output[:-retain_how_many]

if should_delete:
    print check_output(["aptly", "repo", "remove", repo] + should_delete)
else:
    print "nothing to delete"
Since it's already in Python, if @smira is interested I could try submitting a pull request to integrate such functionality into aptly itself. Any idea of how you'd like the command line to look? I'd probably create an "aptly repo" subcommand.
One of the issues that frequently pops up is that when one changes the version scheme or the package name, everything gets borked and all the old packages must be removed (sometimes). Can it deal with inconsistently versioned or named packages somehow?
Debian package versioning lets package maintainers cope with changing upstream version schemes by prefixing the version number with an epoch; see http://manpages.ubuntu.com/manpages/trusty/man5/deb-version.5.html
Because the python script uses from apt_pkg import version_compare to do its version comparisons, it's likely to handle that correctly.
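To make the epoch rule concrete, here is a rough pure-Python sketch of the ordering described in deb-version(5). It is only an illustration: `split_version`, `deb_version_compare` and the helpers are names invented for this example, and it assumes plain ASCII version strings. The scripts in this thread use apt_pkg.version_compare, which is the safer choice in practice.

```python
from functools import cmp_to_key


def _order(ch):
    # '~' sorts before everything (even end of string, passed here as ""),
    # letters sort before all other non-digit characters.
    if ch == "~":
        return -1
    if ch == "" or ch.isdigit():
        return 0
    if ch.isalpha():
        return ord(ch)
    return ord(ch) + 256


def _compare_part(a, b):
    i = j = 0
    while i < len(a) or j < len(b):
        # Compare the non-digit run character by character.
        while (i < len(a) and not a[i].isdigit()) or (j < len(b) and not b[j].isdigit()):
            oa = _order(a[i] if i < len(a) else "")
            ob = _order(b[j] if j < len(b) else "")
            if oa != ob:
                return -1 if oa < ob else 1
            i += 1
            j += 1
        # Compare the following digit run numerically (an empty run counts as 0).
        start_i, start_j = i, j
        while i < len(a) and a[i].isdigit():
            i += 1
        while j < len(b) and b[j].isdigit():
            j += 1
        na, nb = int(a[start_i:i] or "0"), int(b[start_j:j] or "0")
        if na != nb:
            return -1 if na < nb else 1
    return 0


def split_version(version):
    # [epoch:]upstream[-revision]; a missing epoch counts as 0.
    epoch, sep, rest = version.partition(":")
    if not sep:
        epoch, rest = "0", version
    upstream, sep, revision = rest.rpartition("-")
    if not sep:
        upstream, revision = rest, ""
    return int(epoch), upstream, revision


def deb_version_compare(v1, v2):
    e1, u1, r1 = split_version(v1)
    e2, u2, r2 = split_version(v2)
    if e1 != e2:
        return -1 if e1 < e2 else 1
    return _compare_part(u1, u2) or _compare_part(r1, r2)


# A package that switched from a date-based scheme to 1.x uses epoch 1 to stay "newer".
versions = ["20160101-1", "1:1.0-1", "20151231-2"]
print(sorted(versions, key=cmp_to_key(deb_version_compare)))
```

The epoch is compared first, so any version with epoch 1 sorts after every version without an epoch, regardless of how the upstream numbering changed.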
On Mon, Nov 7, 2016 at 8:02 AM, figtrap [email protected] wrote:
One of the issues that frequently pops up is that when one changes the version scheme or the package name, everything gets borked and all the old packages must be removed (sometimes). Can it deal with inconsistently versioned or named packages somehow?
Thank you, I totally forgot about the epoch.
Tim Kelley
On Mon, Nov 7, 2016 at 10:19 AM, Dan Kegel [email protected] wrote:
Debian package versioning lets package maintainers cope with changing upstream version schemes by prefixing the version number with an epoch; see http://manpages.ubuntu.com/manpages/trusty/man5/deb-version.5.html
Because the python script uses from apt_pkg import version_compare to do its version comparisons, it's likely to handle that correctly.
I added a few things to the script of @alanfranz, now it is possible to use package queries to remove old versions.
Example call:
./purge_old_versions.py --dry-run --repo release-repo --package-query 'Name (% ros-indigo-*)' -n 1
#!/usr/bin/env python
from __future__ import print_function

import argparse
import re
import sys

from apt_pkg import version_compare, init_system
from subprocess import check_output, CalledProcessError


class PurgeOldVersions:
    def __init__(self):
        self.args = self.parse_arguments()

        if self.args.dry_run:
            print("Run in dry mode, without actually deleting the packages.")

        if not self.args.repo:
            sys.exit("You must declare a repository with: --repo")

        if not self.args.package_query:
            sys.exit("You must declare a package query with: --package-query")

        print("Remove " + self.args.package_query + " from " + self.args.repo +
              " and keep the last " + str(self.args.retain_how_many) +
              " packages")

    @staticmethod
    def parse_arguments():
        parser = argparse.ArgumentParser(
            formatter_class=argparse.RawTextHelpFormatter)
        parser.add_argument("--dry-run", dest="dry_run",
                            help="List packages to remove without removing "
                                 "them.", action="store_true")
        parser.add_argument("--repo", dest="repo",
                            help="Which repository should be searched?",
                            type=str)
        parser.add_argument("--package-query", dest="package_query",
                            help="Which packages should be removed?\n"
                                 "e.g.\n"
                                 "  - Single package: ros-indigo-rbdl.\n"
                                 "  - Query: 'Name (%% ros-indigo-*)' "
                                 "to match all ros-indigo packages. See \n"
                                 "https://www.aptly.info/doc/feature/query/",
                            type=str)
        parser.add_argument("-n", "--retain-how-many", dest="retain_how_many",
                            help="How many package versions should be kept?",
                            type=int, default=1)
        return parser.parse_args()

    def get_packages(self):
        init_system()

        packages = []

        try:
            output = check_output(["aptly", "repo", "remove", "-dry-run=true",
                                   self.args.repo, self.args.package_query])
            output = [line for line in output.split("\n") if
                      line.startswith("[-]")]
            output = [line.replace("[-] ", "") for line in output]

            # Strip the "_version_arch removed" tail to get bare package names
            for p in output:
                packages.append(
                    re.sub(r"[_](\d{1,}[:])?\d{1,}[.]\d{1,}[.]\d{1,}[-](.*)", '', p))
            packages = list(set(packages))
            packages.sort()

        except CalledProcessError as e:
            print(e)

        finally:
            return packages

    def purge(self):
        init_system()

        packages = self.get_packages()
        if not packages:
            sys.exit("No packages to remove.")

        # Initial call to print 0% progress
        i = 0
        l = len(packages)
        printProgressBar(i, l, prefix='Progress:', suffix='Complete', length=50)

        packages_to_remove = []
        for package in packages:
            try:
                output = check_output(["aptly", "repo", "remove",
                                       "-dry-run=true", self.args.repo,
                                       package])
                output = [line for line in output.split("\n") if
                          line.startswith("[-]")]
                output = [line.replace("[-] ", "") for line in output]
                output = [line.replace(" removed", "") for line in output]

                def sort_cmp(name1, name2):
                    version_and_build_1 = name1.split("_")[1]
                    version_and_build_2 = name2.split("_")[1]
                    return version_compare(version_and_build_1,
                                           version_and_build_2)

                output.sort(cmp=sort_cmp)
                should_delete = output[:-self.args.retain_how_many]
                packages_to_remove += should_delete

                i += 1
                printProgressBar(i, l, prefix='Progress:', suffix='Complete',
                                 length=100)

            except CalledProcessError as e:
                print(e)

        print(" ")
        if self.args.dry_run:
            print("\nThese packages would be deleted:")
            for p in packages_to_remove:
                print(p)
        else:
            if packages_to_remove:
                print(check_output(["aptly", "repo", "remove",
                                    self.args.repo] + packages_to_remove))
                print("\nRun 'aptly publish update ...' "
                      "to update the repository.")
            else:
                print("nothing to remove")


# Print iterations progress
def printProgressBar(iteration, total, prefix='', suffix='', decimals=1,
                     length=100, fill='#'):
    """
    Call in a loop to create terminal progress bar
    @params:
        iteration   - Required  : current iteration (Int)
        total       - Required  : total iterations (Int)
        prefix      - Optional  : prefix string (Str)
        suffix      - Optional  : suffix string (Str)
        decimals    - Optional  : positive number of decimals in percent
                                  complete (Int)
        length      - Optional  : character length of bar (Int)
        fill        - Optional  : bar fill character (Str)
    """
    percent = ("{0:." + str(decimals) + "f}").format(
        100 * (iteration / float(total)))
    filled_length = int(length * iteration // total)
    bar = fill * filled_length + '-' * (length - filled_length)
    print('\r%s |%s| %s%% %s' % (prefix, bar, percent, suffix), end='\r')
    # Print New Line on Complete
    if iteration == total:
        print()


if __name__ == '__main__':
    purge_old_versions = PurgeOldVersions()
    purge_old_versions.purge()
I had a feature in the works which I never completed, as it requires some large-scale changes. The idea was to enhance package queries with Python-like slice syntax, so that you could write package[3:], which would mean "all versions of package except the first 3".
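For reference, in Python's own slice notation `[:3]` selects the first three items of a list and `[3:]` selects everything after them. The short sketch below shows both on a made-up version list. Sorting newest first is an assumption made for the example; the thread does not say how the proposed aptly syntax would order versions.

```python
# Hypothetical versions of one package, sorted newest first (assumed order).
versions = ["1.5", "1.4", "1.3", "1.2", "1.1"]

newest_three = versions[:3]          # the first three items
all_but_newest_three = versions[3:]  # everything after the first three

print("keep:  ", newest_three)
print("remove:", all_but_newest_three)
```

With newest-first ordering, `[3:]` is exactly the set a "keep the last 3 versions" purge would remove.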
@samuelba Thanks for the script, but it does not work properly with a query like Name (% *test), Version(% *dev): it lists all packages for deletion, as if ignoring the Version filter. The normal aptly command handles such a query without a problem, so I had to revert to plain old bash hacking.
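A likely cause, judging from the script itself: get_packages() reduces the query result to bare package names, and purge() then re-queries each name on its own, so any Version filter in the original query is lost. One way around that is to group the output of the single original query instead of querying again. The sketch below assumes the dry-run output format that the scripts in this thread already parse (lines of the form "[-] name_version_arch removed"); `group_dry_run_output` and the sample text are made up for illustration.

```python
from collections import defaultdict


def group_dry_run_output(output):
    """Group 'aptly repo remove -dry-run' output by (name, arch).

    Only lines starting with '[-] ' are considered; everything else
    (progress messages and the like) is skipped.
    """
    groups = defaultdict(list)
    for line in output.splitlines():
        if not line.startswith("[-] "):
            continue
        key = line[len("[-] "):]
        if key.endswith(" removed"):
            key = key[: -len(" removed")]
        parts = key.split("_")
        if len(parts) != 3:
            continue  # not a name_version_arch key, ignore it
        name, version, arch = parts
        groups[(name, arch)].append(version)
    return dict(groups)


# Hypothetical output of one filtered query.
sample = """Loading packages...
[-] foo-test_1.0-dev_amd64 removed
[-] foo-test_1.1-dev_amd64 removed
[-] bar-test_2.0-dev_i386 removed
"""
print(group_dry_run_output(sample))
```

Each group can then be sorted with a Debian-aware comparison and trimmed, without touching versions that the original query never matched.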
The feature mentioned by @smira would be tremendously useful for maintaining repositories that can accrue a large number of different versions. I was wondering if there has been any progress on this in the last couple of months?
No progress on that so far. I have a branch which implements part of the syntax, but nothing more.
This thread helped me a lot. Here is my take on the issue based on what I've read here. Hope it will be useful.
#!/usr/bin/env python3

import sys
import json
import codecs
import mimetypes
import uuid
import io
import re
from pathlib import Path
from urllib.request import Request, urlopen
from urllib.error import URLError, HTTPError
from functools import cmp_to_key
from apt_pkg import version_compare, init_system

init_system()


class MultipartFormdataEncoder(object):
    def __init__(self):
        self.boundary = uuid.uuid4().hex
        self.content_type = 'multipart/form-data; boundary={}'.format(self.boundary)

    def iter(self, files):
        encoder = codecs.getencoder('utf-8')
        for file in files:
            print('uploading file %s...' % str(file))
            yield encoder('--{}\r\n'.format(self.boundary))
            yield encoder('Content-Disposition: form-data; name="{}"; filename="{}"\r\n'.format(file.name, file.name))
            yield encoder('Content-Type: {}\r\n'.format(mimetypes.guess_type(file.name)[0] or 'application/octet-stream'))
            yield encoder('\r\n')
            with open(str(file), 'rb') as fd:
                buff = fd.read()
                yield (buff, len(buff))
            yield encoder('\r\n')
        yield encoder('--{}--\r\n'.format(self.boundary))

    def encode(self, files):
        body = io.BytesIO()
        for chunk, chunk_len in self.iter(files):
            body.write(chunk)
        return self.content_type, body.getvalue()


def sort_cmp(p1, p2):
    v1 = p1.split(' ')[2]
    v2 = p2.split(' ')[2]
    return version_compare(v1, v2)


def request(url, method='GET', data=None, files=None):
    headers = {'Content-Type': 'application/json'}
    if data is not None:
        data = json.dumps(data).encode('utf-8')
    if files is not None:
        content_type, data = MultipartFormdataEncoder().encode(files)
        headers = {'Content-Type': content_type}
    req = Request(url, data, headers)
    req.get_method = lambda: method
    try:
        response = urlopen(req)
    except HTTPError as e:
        print('the server couldn\'t fulfill the request.')
        print('error code: ', e.code)
    except URLError as e:
        print('failed to reach a server.')
        print('reason: ', e.reason)
    else:
        rep = json.loads(response.read().decode('utf-8'))
        return rep


def purge(url, repo, name, retain_how_many):
    data = request(url+'/api/repos/'+repo+'/packages')
    data = list(filter(lambda x: x.split(' ')[1]==name, data))
    data = sorted(data, key=cmp_to_key(sort_cmp))
    should_delete = data[:-retain_how_many]

    if should_delete:
        print('the following packages are going to be removed from %s: %s' % (repo, should_delete))
        data = {'PackageRefs': should_delete}
        rep = request(url+'/api/repos/'+repo+'/packages', method='DELETE', data=data)
    else:
        print('no version of %s deleted in %s' % (name, repo))


def main():
    url = sys.argv[1]
    repo_pattern = re.compile(sys.argv[2])
    package_glob = sys.argv[3]
    retain_how_many = int(sys.argv[4])
    directory = str(uuid.uuid4())

    # Upload packages
    packages = list(Path('.').glob(package_glob))
    print('uploading %s packages in directory %s' % (len(packages), directory))
    request(url+'/api/files/'+directory, method='POST', files=packages)

    # List repos matching repo_pattern
    repos = [r['Name'] for r in request(url+'/api/repos')]
    repos = [r for r in repos if repo_pattern.match(r)]
    print("pattern matches the following repositories: %s" % repos)

    names = {file.name.split('_')[0] for file in packages}
    for repo in repos:
        # Add package to repo
        rep = request(url+'/api/repos/'+repo+'/file/'+directory+'?noRemove=1', method='POST')
        # Delete old package
        for name in names:
            purge(url, repo, name, retain_how_many)

    # Delete upload directory
    request(url+'/api/files/'+directory, method='DELETE')


if __name__ == '__main__':
    main()
Usage: ./aptly-push <http://APTLYAPI> <REPOPATTERN> <PATH> <RETAINHOWMANY>
It will upload all packages matching the PATH glob and add them to all the repos matching the REPOPATTERN. For each repo and for each package, it then limits the number of versions to RETAINHOWMANY.
Example: ./aptly-push http://127.0.0.1:9876 "myrepo-(?:prod|staging)" "./build/*.deb" 3
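One detail worth knowing when writing REPOPATTERN: the script filters repositories with `repo_pattern.match(r)`, and Python's `re.match` only anchors at the start of the string, so a repository whose name merely begins with the pattern also matches. The sketch below shows the difference from `fullmatch`; the repository names are made up for illustration.

```python
import re

# Hypothetical repository names.
repos = ["myrepo-prod", "myrepo-staging", "myrepo-prod-archive", "other-myrepo-prod"]
pattern = re.compile(r"myrepo-(?:prod|staging)")

# match() anchors only at the start, so "myrepo-prod-archive" is included.
prefix_matches = [r for r in repos if pattern.match(r)]

# fullmatch() requires the whole name to match the pattern.
full_matches = [r for r in repos if pattern.fullmatch(r)]

print(prefix_matches)
print(full_matches)
```

Ending the pattern with `$` on the command line gives the stricter behaviour without changing the script.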
And yet another version, based on the version of @samuelba. We have 8 repos, each with 2 or 4 components, 3 to 4 architectures and some 100 packages. While @samuelba's version worked nicely (after porting from Python 2 to Python 3), it took about 10 minutes to purge all of them. So instead of painting a progress bar, this one should be fast enough to not need one :)
#!/usr/bin/env python3
from __future__ import print_function

import argparse
import re
import sys

from apt_pkg import version_compare, init_system
from subprocess import check_output, CalledProcessError
from functools import cmp_to_key


class PurgeOldVersions:
    def __init__(self):
        self.args = self.parse_arguments()

        if self.args.dry_run:
            print("Running in dry mode, without actually deleting the packages.")

        if not self.args.repo:
            sys.exit("You must declare a repository with: --repo")

        if not self.args.package_query:
            sys.exit("You must declare a package query with: --package-query")

        print("Removing " + self.args.package_query + " from " + self.args.repo +
              " and keeping the last " + str(self.args.retain_how_many) +
              " packages")

    @staticmethod
    def parse_arguments():
        parser = argparse.ArgumentParser(
            formatter_class=argparse.RawTextHelpFormatter)
        parser.add_argument("--dry-run", dest="dry_run",
                            help="List packages to remove without removing "
                                 "them.", action="store_true")
        parser.add_argument("--repo", dest="repo",
                            help="Which repository should be searched?",
                            type=str)
        parser.add_argument("--package-query", dest="package_query",
                            help="Which packages should be removed?\n"
                                 "e.g.\n"
                                 "  - Single package: ros-indigo-rbdl.\n"
                                 "  - Query: 'Name (%% ros-indigo-*)' "
                                 "to match all ros-indigo packages. See \n"
                                 "https://www.aptly.info/doc/feature/query/",
                            type=str)
        parser.add_argument("-n", "--retain-how-many", dest="retain_how_many",
                            help="How many package versions should be kept?",
                            type=int, default=1)
        return parser.parse_args()

    def get_packages(self):
        init_system()

        packages = {}

        try:
            print("getting packages %s" % self.args.package_query)
            output = check_output(["aptly", "repo", "remove", "-dry-run",
                                   self.args.repo, self.args.package_query]).decode('utf-8')
            output = [line for line in output.splitlines() if
                      line.startswith("[-]")]
            output = [line.replace("[-] ", "") for line in output]
            output = [line.replace(" removed", "") for line in output]

            # Group versions by package name and architecture
            for p in output:
                packageName = p.split("_")[0]
                version = p.split("_")[1]
                arch = p.split("_")[2]
                if packageName not in packages:
                    packages[packageName] = {}
                if arch not in packages[packageName]:
                    packages[packageName][arch] = []
                packages[packageName][arch].append(version)

        except CalledProcessError as e:
            print(e)

        finally:
            return packages

    def purge(self):
        init_system()

        packages = self.get_packages()
        packagesToRemove = []
        for package in packages:
            for arch in packages[package]:
                versions = packages[package][arch]
                versions = sorted(versions, key=cmp_to_key(version_compare))
                versionsToRemove = versions[:-self.args.retain_how_many]
                for versionToRemove in versionsToRemove:
                    packagesToRemove.append("%s_%s_%s" % (package, versionToRemove, arch))

        if len(packagesToRemove) == 0:
            sys.exit("No packages to remove.")

        if self.args.dry_run:
            print(check_output(["aptly", "repo", "remove", "-dry-run", self.args.repo] + packagesToRemove).decode("utf-8"))
        else:
            print(check_output(["aptly", "repo", "remove", self.args.repo] + packagesToRemove).decode("utf-8"))


if __name__ == '__main__':
    purge_old_versions = PurgeOldVersions()
    purge_old_versions.purge()
Could we get some guidance on what would be required for a 3rd party to implement this?