cog
[#567] Use pinned versions for `apt-get` packages
Based on Dockerfile best practices:
Version pinning forces the build to retrieve a particular version regardless of what’s in the cache. This technique can also reduce failures due to unanticipated changes in required packages. https://docs.docker.com/develop/develop-images/dockerfile_best-practices/#apt-get
Signed-off-by: Iván Perdomo [email protected]
Amazing, thank you so much!
One problem I've noticed with pinning apt dependencies is old versions sometimes get removed from the repositories. I wonder what Ubuntu's policy is about this, and whether this will remain reliable.
We should also totally have a `yarn add`-style command for adding a package pinned to its current version. 🤔
> One problem I've noticed with pinning apt dependencies is old versions sometimes get removed from the repositories. I wonder what Ubuntu's policy is about this, and whether this will remain reliable.
AFAIK, Alpine Linux is the distribution that removes old packages. That's why they introduced =~ when matching a package version in apk, e.g. apk add --no-cache curl~=7
I just tested a 2018 Dockerfile with a pinned package version, and it still builds successfully, e.g.
https://github.com/akvo/akvo-dockerfiles/blob/3410c095cd55db7830b21bf0fc427eb239e6109f/run-as-user/Dockerfile#L6=
It was based on `debian:stretch`, not an Ubuntu base image.
I've searched around but I couldn't find any documentation on their policy.
I reached out to a friend who is closer to the Debian community and knows more about the package management side, and he suggested that:
- By default, old packages are not reachable. For example, Ubuntu 18.04's libglib2.0-0 has only some versions available (https://packages.ubuntu.com/bionic/libglib2.0-0 ), not the full version history (http://changelogs.ubuntu.com/changelogs/pool/main/g/glib2.0/glib2.0_2.56.4-0ubuntu0.18.04.9/changelog )
- By pinning the package to a specific version you implicitly discard security updates and minor fixes
- While it's true that you lose reproducibility without pinning, getting security updates outweighs the potential problem of an update breaking your code
- If you use an LTS version, the package updates are quite stable
I can update the PR to use versions like `libglib2.0-0=2.*` (it's possible), or you can just ignore these changes and close the PR without merging.
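To make the trade-off concrete, here is a minimal Python sketch of how a Debian/Ubuntu version string breaks down into `[epoch:]upstream[-revision]`, and how a looser pin such as `libglib2.0-0=2.*` could be derived from an exact one. The helper names (`split_debian_version`, `major_wildcard_pin`) are hypothetical and not part of Cog or apt; the version string is the one from the changelog URL above.

```python
from typing import NamedTuple


class DebianVersion(NamedTuple):
    epoch: str     # "" when the version has no epoch
    upstream: str  # the upstream project's own version
    revision: str  # the distribution's revision, "" if absent


def split_debian_version(version: str) -> DebianVersion:
    """Split '[epoch:]upstream[-revision]' into its three parts."""
    epoch, sep, rest = version.partition(":")
    if not sep:
        # No colon, so there is no epoch.
        epoch, rest = "", version
    # The revision is whatever follows the LAST hyphen.
    upstream, sep, revision = rest.rpartition("-")
    if not sep:
        upstream, revision = rest, ""
    return DebianVersion(epoch, upstream, revision)


def major_wildcard_pin(package: str, version: str) -> str:
    """Build a pin that only fixes the upstream major version."""
    parts = split_debian_version(version)
    major = parts.upstream.split(".")[0]
    prefix = f"{parts.epoch}:" if parts.epoch else ""
    return f"{package}={prefix}{major}.*"


if __name__ == "__main__":
    print(split_debian_version("2.56.4-0ubuntu0.18.04.9"))
    print(major_wildcard_pin("libglib2.0-0", "2.56.4-0ubuntu0.18.04.9"))
```

An exact pin fixes the Ubuntu revision (`0ubuntu0.18.04.9`) as well as the upstream version, which is the part that stops matching once a security update is published. The wildcard form gives that up in exchange for less reproducibility.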
Ah, interesting. I wonder if we can do something clever at the Cog level. E.g. glib2.0==2.56.4 expands to glib2.0==2.56.4-* or something. Or maybe we could have a separate lock file.
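As a rough illustration of that expansion idea, something like the following Python sketch could work. It assumes a hypothetical Cog-level `name==upstream` syntax and a hypothetical `expand_pin` helper, neither of which exists in Cog today. Note that apt itself uses a single `=` between package name and version, so the sketch emits that form.

```python
def expand_pin(spec: str) -> str:
    """Turn a 'name==upstream' spec into an apt-style 'name=upstream-*' pin.

    The '==' input syntax is an assumption made for this sketch.
    """
    name, sep, upstream = spec.partition("==")
    if not sep or not name or not upstream:
        raise ValueError(f"expected 'name==version', got {spec!r}")
    if "*" in upstream:
        # The user already wrote a wildcard; pass it through unchanged.
        return f"{name}={upstream}"
    # Leave the distribution revision open so rebuilt packages still match.
    return f"{name}={upstream}-*"


if __name__ == "__main__":
    print(expand_pin("glib2.0==2.56.4"))
```

This keeps the upstream version fixed while letting the distribution revision float, so the user never has to look up the revision suffix by hand.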
My feeling is we might want to have a way of adding these versions automatically somehow for users before we recommend it, because finding and specifying the correct wildcard version is rather difficult.
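One possible shape for that automation, as a minimal Python sketch: ask `dpkg-query` inside the built image which versions are installed, then turn the answer into pins. The function names and the idea of running this as a post-build step are assumptions for illustration, not an existing Cog feature.

```python
import subprocess


def parse_dpkg_listing(listing: str) -> dict[str, str]:
    """Parse lines of 'package=version' into a {package: version} mapping."""
    pins = {}
    for line in listing.splitlines():
        # Debian versions never contain '=', so the first '=' is the separator.
        name, sep, version = line.strip().partition("=")
        if sep and name and version:
            pins[name] = version
    return pins


def installed_versions(packages: list[str]) -> dict[str, str]:
    """Query dpkg for the installed versions of the given packages.

    Only works on a Debian-based system where dpkg-query is available.
    """
    result = subprocess.run(
        ["dpkg-query", "-W", "-f", "${Package}=${Version}\\n", *packages],
        check=True,
        capture_output=True,
        text=True,
    )
    return parse_dpkg_listing(result.stdout)


def format_pins(pins: dict[str, str]) -> str:
    """Render pins as arguments for 'apt-get install', sorted by package name."""
    return " ".join(f"{name}={version}" for name, version in sorted(pins.items()))
```

On a Debian or Ubuntu image, `format_pins(installed_versions(["libglib2.0-0"]))` would return the exact pin for whatever version is currently installed, which could then be written back to the config or to a separate lock file.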
Perhaps an alternative solution for this might be to encourage users to pin to a particular Ubuntu version. For stable Ubuntu versions, the system package versions will also remain stable. I've recorded this here: #575
@iperdomo Really appreciate the contribution here, but I'm going to close this because it does seem more complex than just using pinned versions like this. We can continue the discussion in #567.
Thank you for helping us figure this out! 😄
I'd like to reopen this PR for consideration.
Although there are some complexities involved with pinning system packages, the motivations are sound, and I think this is something we should do — and not just for user-specified system packages, but for anything installed in Cog's base image, too.
GitHub recently made it easier to report package dependencies in repos (https://github.com/replicate/cog/issues/1185). That helps resolve the central dilemma of pinning, namely that pinned dependencies don't get updated. If Cog users got Dependabot alerts when vulnerabilities were reported, I think that'd be really compelling.