SHPC New Feature - Upgrade Functionality
Hello @vsoch,
I am Ausbeth Aguguo, a collaborator with the Wellcome Sanger Institute. I contributed to this amazing project under the guidance of Matthieu Muffato, Guoying Qi and the Tree of Life Informatics Infrastructure Team.
In this pull request, I am introducing new functionality for SHPC: a shpc upgrade command that helps users manage installed and available software versions within SHPC. Here’s a breakdown:
shpc list
quay.io/biocontainers/samtools:1.20--h50ea8bc_0
quay.io/biocontainers/biocode:0.10.0--pyhdfd78af_0
User performs shpc upgrade quay.io/biocontainers/samtools
It checks the registry for the latest available version of samtools. If that version is already among the installed versions of samtools, shpc just tells the user that samtools is up to date. Otherwise, it performs an upgrade by installing the latest version.
User performs shpc upgrade --all
It checks every entry in the user’s list and upgrades any outdated software.
User performs shpc upgrade quay.io/biocontainers/samtools --dry-run
It informs the user whether the latest version of samtools is already installed or is available to install, without performing an upgrade.
User performs shpc upgrade --all --dry-run
It displays the user’s installed software and marks each entry as outdated or up to date, without upgrading the outdated ones.
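To make the four cases above concrete, here is a minimal sketch of the decision they describe. It is not the code in this pull request: the function name `plan_upgrade`, the plain dictionaries, and the tag `9.9--example` are hypothetical stand-ins for the installed list and the registry lookup.

```python
def plan_upgrade(installed, latest, dry_run=False):
    """Decide what to do for each installed software.

    installed: dict mapping software name -> set of installed version tags
    latest:    dict mapping software name -> latest tag known to the registry
    Both arguments are hypothetical stand-ins for what shpc would look up itself.
    Returns a list of (name, status, latest_tag) tuples, sorted by name.
    """
    plan = []
    for name, versions in sorted(installed.items()):
        tag = latest[name]
        if tag in versions:
            # The latest tag is already among the installed versions.
            plan.append((name, "up-to-date", tag))
        elif dry_run:
            # Report only; nothing would be installed.
            plan.append((name, "outdated", tag))
        else:
            plan.append((name, "install", tag))
    return plan
```

Upgrading a single software corresponds to passing a one-entry `installed` dictionary, and `--all` corresponds to passing the whole list.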
Additional Information:
- shpc upgrade cannot be performed when a user’s software list is empty. However, it will prompt the user to install the software the user tried to upgrade.
- shpc upgrade gives the user the option to either uninstall or preserve previous versions of software during the upgrade process.
- shpc upgrade gives the user the option to install, or not install, the latest version of software into the views the previous versions were installed in.
- shpc upgrade does not allow the user to include a version in the recipe, because the goal is to check the software itself and install the latest version available, not to upgrade one specific version. Therefore, this command is invalid: 'shpc upgrade quay.io/biocontainers/samtools:1.20--h50ea8bc_0'
- shpc upgrade does not allow these argument combinations, to maintain clarity in command execution:
shpc upgrade quay.io/biocontainers/samtools --all
shpc upgrade quay.io/biocontainers/samtools --all --dry-run
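The argument rules listed above could be enforced along these lines. This is only a sketch using the standard argparse module, not the parser in this pull request. The function name `parse_upgrade_args` and the error messages are made up, and the rule that one of a software name or --all must be given is my own assumption.

```python
import argparse


def parse_upgrade_args(argv):
    """Parse and validate arguments for a hypothetical 'shpc upgrade' parser."""
    parser = argparse.ArgumentParser(prog="shpc upgrade")
    parser.add_argument("software", nargs="?", help="software to upgrade, without a version")
    parser.add_argument("--all", action="store_true", dest="upgrade_all")
    parser.add_argument("--dry-run", action="store_true", dest="dry_run")
    args = parser.parse_args(argv)

    # A software name and --all are mutually exclusive.
    if args.software and args.upgrade_all:
        parser.error("give either a software name or --all, not both")
    # Assumption (not stated in the thread): one of the two is required.
    if not args.software and not args.upgrade_all:
        parser.error("give a software name or --all")
    # Reject a version tag; only look at the last path component so that a
    # registry host with a port (e.g. host:5000/name) is not misread as a tag.
    if args.software and ":" in args.software.rsplit("/", 1)[-1]:
        parser.error("upgrade takes a software name without a version")
    return args
```

`parser.error` prints a usage message and exits with status 2, which matches how argparse reports any other invalid command line.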
Finally, this upgrade functionality has been tested locally, including both manual tests and unit tests.
Thanks @Ausbeth ! This introduces a confusing point for the user - having "update" and "upgrade" (that have different interactions). What exactly is upgrade doing that update is not, and did you think about a way to consolidate the need?
It might help to start at the beginning and tell me the problem you needed to solve.
Hi @vsoch, currently in SHPC, a user who wants to install the latest version of one of their installed software must manually perform the entire version-checking and installation process. This is rather time-consuming, as the user may have to run three commands to achieve their aim: shpc show quay.io/biocontainers/samtools (to view the latest tag), shpc list quay.io/biocontainers/samtools (to check whether the latest is installed), and shpc install quay.io/biocontainers/samtools:latest (to install the latest). Incidentally, the user could attempt to install the latest version by doing shpc install quay.io/biocontainers/samtools; however, this could potentially reinstall the software version because the latest may already be installed.
The new upgrade feature aims to solve this problem by automating the process. It checks the container.yaml recipe for the latest version of each software the user has installed. If there is a new “latest” and the user does not have it installed, shpc upgrade automatically installs that version; otherwise, it tells the user that their software is up to date and no upgrade is required. Additionally, if an actual upgrade occurs, the user is given the option to either uninstall or preserve older versions of the upgraded software. This lets users install the latest versions of software with a single command.
This is different from shpc update. “Update” is concerned with updating container recipes by fetching new metadata for container images from the registry and updating the configuration files accordingly. It does not interact with the software the user has installed. “Upgrade”, on the other hand, complements “Update” by directly interacting with the installed software on the user’s system, ensuring they have the latest versions, based on their latest tags in the container.yaml, installed.
I don't think I'm convinced by needing this addition, especially if (as you say) the same can be accomplished with an shpc install. This would add a lot of confusion for not a lot of new functionality. If this is a problem:
however, this could potentially reinstall the software version because the latest may already be installed.
That is probably what you want to work on - if install is doing a reinstall when latest is already installed, then it should not.
Hi @vsoch . The use case is still the same as in https://github.com/singularityhub/singularity-hpc/issues/501#issuecomment-1065818554 . Following up on
I figured most folks would install the latest and call it a day, and pull once in a while for new containers or versions.
For people using the remote registry, the container.yaml files are automatically updated in the background (yeah github actions !). shpc upgrade can replace software with the latest version. Users don't need to think about versions and cleaning up the old installations. It just keeps everything up to date. Also, for convenience, new versions are directly added onto the views of the versions they replace.
In our case, we keep a local copy of the registry with only the software and versions we want installed, but the principle remains the same. At regular intervals we update the "latest" tags in all container.yaml and we'd use shpc upgrade to replace software with the latest versions.
Under the hood, yes, shpc/client/upgrade.py is just wrapping the existing functions shpc list, shpc view list, shpc uninstall, shpc install, shpc view install. I would rather see the functionality properly, safely, implemented and tested within shpc rather than letting users / admins write that up in a shell script.
Thanks for the reference! I think that discussion was still oriented around the (current) update command. For this bit:
For people using the remote registry, the container.yaml files are automatically updated in the background (yeah github actions !). shpc upgrade can replace software with the latest version. Users don't need to think about versions and cleaning up the old installations. It just keeps everything up to date. Also, for convenience, new versions are directly added onto the views of the versions they replace.
I like the idea, but what I don't like is having update and upgrade. Ubuntu / debian has that with apt and it's generally confusing. I'd be open to other proposals to get that same functionality. This would warrant something more simple like:
shpc install --upgrade
Which scopes it under install (where I think it should be, since it's essentially doing install for new versions?) and explicitly says "install the latest (upgrade) for my modules". I'm open to other ideas too - I just don't like having both update and upgrade, which makes the library confusing.
Is this the interface you're proposing ?
shpc install software[:version] # Install the software, either the latest version or the one specified
shpc install software --upgrade [--dry-run] # if the latest version is not installed, install it and uninstall all previous ones - preserving the views
shpc install --all --upgrade [--dry-run] # Like a loop on `shpc install software --upgrade` for all installed software
Since you're suggesting expanding the install function, we should add that @Ausbeth is also finalising a reinstall command that reinstalls all installed software. This is useful when having to regenerate modules/wrapper scripts because the template changed. The current interface is:
shpc reinstall software:version
shpc reinstall software
shpc reinstall --all
Let's start with just the install set - I'm not sure about the latter. For this one:
shpc install software --upgrade [--dry-run]
Why would we need to have --upgrade if the install (without it) would install the latest? I understand the need to have concise commands to do multi-step actions, but I don't want to add new functionality without thinking it through first.
shpc install software 1) doesn't uninstall older versions, 2) doesn't replicate the views of previously installed versions
The analogy is homebrew, which offers:
brew list [FORMULA|CASK...]
brew install FORMULA|CASK...
brew reinstall FORMULA|CASK...
brew uninstall FORMULA|CASK...
brew update
brew upgrade [FORMULA|CASK...]
shpc already has list [software], install software, uninstall software and update. We're seeking to add upgrade [software] and reinstall [software].
hmm :/ @marcodelapierre what do you think?
Hi there :)
I have read the thread, and I like this landing point:
shpc install software --upgrade [--dry-run] # if the latest version is not installed, install it and uninstall all previous ones - preserving the views
shpc install --all --upgrade [--dry-run] # Like a loop on `shpc install software --upgrade` for all installed software
I see the value in providing an automated way to check for SHASUM updates of given versions and install them, for one or better many installed images.
Side note: in general, this may also be useful for non-latest versions, in those (rare) cases where the SHASUM is updated for the same tag, due to critical issues with the original build. Right?
I like how --upgrade is just a flag to install, rather than a new sub-command, to keep the key interface cleaner.
Questions:
- If the old image is uninstalled, then views should always be updated right? So no dedicated flag for this option is needed. (if I understand correctly this detail of the functionality)
- Is there a simple way to undo this operation? Suppose the command is run by accident, or the outcome is undesired. This can be particularly critical for the --all case. Or how can we safeguard this scenario?
Finally, I would consider also having --reinstall (maybe --rerun, or --force/-f?) as a flag for install. It's verbose and repetitive, but I am not sure we need an extra subcommand for this operation.
One more thought.
If the registry in use is already updated (as it should -- better to have the registry sync separate, right?), how are upgrade and reinstall differentiated?
Ultimately, isn't it the same functionality?
There could be a "module only"-like flag to just re-generate module and wrapper scripts.
If the old image is uninstalled, then views should always be updated right? So no dedicated flag for this option is needed.
The new image will be added to the same views from which the old images were uninstalled. Indeed, this is transparent for the user and shouldn't need any flag.
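As a rough illustration of what "added to the same views" means, here is a sketch that models views as plain sets of name:version strings. The function `replace_in_views` and this data layout are hypothetical and do not reflect how shpc stores views.

```python
def replace_in_views(views, name, old_versions, new_version):
    """Return a copy of views where old versions of `name` are swapped for the new one.

    views: dict mapping view name -> set of "name:version" strings
           (a hypothetical in-memory model, not shpc's on-disk layout).
    Views that never contained an old version are left as they were.
    """
    old = {f"{name}:{version}" for version in old_versions}
    updated = {}
    for view, modules in views.items():
        if modules & old:
            # The view held an old version: drop it and add the new version.
            updated[view] = (modules - old) | {f"{name}:{new_version}"}
        else:
            updated[view] = set(modules)
    return updated
```

Because the set of views to touch is derived from where the old versions were installed, the user does not have to name any view.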
Is there a simple way to undo this operation? Suppose the command is run by accident, or the outcome is undesired. This can be particularly critical for the --all case. Or how can we safeguard this scenario?
First, there's the --dry-run option to show what the command would do. Then, unless --force is given, shpc still prompts the user for all the actions, just like uninstall and install do.
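A minimal sketch of how those two safeguards could interact, assuming each step is represented as a description plus a callable. The name `run_actions`, the prompt wording, and the choice that --dry-run takes precedence over --force are my assumptions, not the current implementation.

```python
def run_actions(actions, dry_run=False, force=False, confirm=input):
    """Run (description, callable) pairs, honouring dry-run and force.

    confirm is the function used to ask the user; it defaults to input()
    and can be swapped out for testing. Returns the descriptions of the
    actions that were actually executed.
    """
    done = []
    for description, action in actions:
        if dry_run:
            # Only report what would happen; never execute anything.
            print(f"[dry-run] would {description}")
            continue
        if not force:
            answer = confirm(f"{description}? [y/N] ")
            if answer.strip().lower() not in ("y", "yes"):
                continue  # the user declined this step
        action()
        done.append(description)
    return done
```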
If the registry in use is already updated (as it should -- better to have the registry sync separate, right?), how are upgrade and reinstall differentiated? Ultimately, isn't it the same functionality?
They're different in our current implementations:
- If the software is already the latest version: reinstall uninstalls+installs it again, upgrade does nothing.
- If the software is in a previous version: reinstall uninstalls+installs it again (still the old version), upgrade uninstalls it and installs the latest version instead.
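Put as code, the two cases come down to the following. This is a sketch with hypothetical function names that returns the steps each command would take rather than running them.

```python
def reinstall_actions(installed_version, latest_version):
    """reinstall: always uninstall and install the same version again."""
    # latest_version is ignored on purpose; reinstall never changes version.
    return [("uninstall", installed_version), ("install", installed_version)]


def upgrade_actions(installed_version, latest_version):
    """upgrade: do nothing if already latest, otherwise move to the latest."""
    if installed_version == latest_version:
        return []
    return [("uninstall", installed_version), ("install", latest_version)]
```

So even with a fully synced registry the two only coincide in the uninstall step; reinstall never changes the installed version, and upgrade never touches software that is already at the latest one.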
Sounds like we have a path forward! Let's start with:
shpc install software --upgrade [--dry-run]
Thanks for your suggestions, @vsoch, @muffato, and @marcodelapierre. Before going ahead, I have some outstanding questions I would like to address.
shpc install software --upgrade [--dry-run] # if the latest version is not installed, install it and uninstall all previous ones - preserving the views
- The uninstallation of previous versions and the replication of the views (adding the latest version to the views of the previous versions) are still actions that prompt the user before taking effect, right? That is necessary for safety reasons, but a user who doesn't want the prompts and just wants to force the actions would include the --force flag. However, shpc install software --force already exists, and from its description, it "replaces existing symlinks". But this actually does nothing because force is not implemented in the install function. So, do we ignore the initial intentions for shpc install software --force and make shpc install software --upgrade --force focus solely on skipping the prompts for uninstalling previous versions and replicating the views? And if so, what should actually happen when --force is used without --upgrade?
- shpc install software:version installs that specific version, but what happens if the user does shpc install software:version --upgrade? Should this raise an error? For "upgrade", shpc upgrade software:version raises an error because the goal was to check the software itself and install the latest version available, rather than upgrading a specific version, which the user is made aware of in the help description. My concern is that the user may wonder why they can use shpc install software:version as a recipe for install but suddenly can't when they include the --upgrade flag.
- If force is not used in install (outside of the view), I would do the implementation without force for now, and then come back to do work just on force - adding/removing it where we think makes sense.
- Upgrade should not be allowed with a version.
@vsoch @muffato @marcodelapierre My apologies for the slow progress, I had a hectic schedule these past two weeks. However, I have completed:
shpc install software --upgrade [--dry-run]