PyBaMM
Infrastructure for nightly releases
I found some hosting options for our proposed nightly builds, along with ways to configure them to work with our existing release infrastructure.
- Anaconda provides artifact storage as a custom PyPI repository and offers 5 GB of storage on a free plan. The paid plans start from $9/month. Note: this method will require the creation of our own Anaconda organisation, since the `scientific-python` repository is not open to packages outside its ecosystem. Some examples of this are:
  a. NumPy and SciPy store their weekly wheels in this repository: https://anaconda.org/scipy-wheels-nightly/repo
  b. Some packages under the `scientific-python` umbrella (`scikit-learn`, `pandas`, `matplotlib`, `xarray`, et cetera) publish at https://anaconda.org/scientific-python-nightly-wheels/repo
  c. Anaconda's own repository
  d. AstroPy used Azure earlier; now they use Anaconda.
  e. If we decide to use Anaconda, we can follow the Scientific Python SPEC-0004, which was specifically drafted for this.
- Cloudsmith.io provides an expensive artifact repository priced at $89 per user per month.
- Google Cloud Artifact Registry provides hosting and storage for Python packages. This is an enterprise solution, so it might be private-only; I am not sure if it can be used for public, open-source projects.
- A Homebrew formula for PyBaMM, written in Ruby. This is a mostly undocumented, niche solution with few resources to consult, and it requires submission to and acceptance by `homebrew-core`. Here is a tutorial I found: https://til.simonwillison.net/homebrew/packaging-python-cli-for-homebrew. `cookiecutter` is an example of a Python package that has been packaged for Homebrew, since it is possible to install it with `brew install cookiecutter`. This solution would be available for Linux and macOS only, since Homebrew does not support Windows.
- Nexus Repository from Sonatype has support for many packaging formats and supports proxying and hosting for PyPI packages. This might also be a private-only solution.
- Artifactory also provides a super expensive artifact registry plan priced at $150 per user per month.
- PEP 503 describes how we can make our own PyPI repository for indexing. Anaconda is compliant with this specification. Some self-hosted solutions that are compatible with it include:
  a. devpi
  b. pypiserver, and a tutorial by Linode to accompany it
  c. Artipie, an open-source PyPI package repository
  d. Pulp, another self-hosted package index that can be used as described in this blog by Red Hat
- I looked at GitHub Packages (https://github.com/features/packages) and found it to be pretty cool, but it doesn't support Python packages yet and probably will not. There were discussions around this in their roadmap earlier, but sadly they closed the feature request: https://github.com/github/roadmap/issues/94. My guess is that this is because Microsoft owns GitHub and so has a say in what the folks at GitHub end up doing; I assume they would like to keep Azure as the standard for a Python package registry rather than offer a competing one.
- However, related to the PEP 503 point above, we could create a PEP 503-compliant `pybamm-team/pybamm-nightly` repository and push wheels to it daily. The size of a PyBaMM installation is around 160 MB (according to libraries.io). We could write a workflow to delete releases older than 30 days so that the size of the repository remains limited. `scientific-python` has a workflow to do this in their Anaconda index. Though there is no real-time vulnerability detection in this case, unlike what commercial solutions provide, we have more control over our release infrastructure anyway and we can mitigate bad actors. It is easy to install a package from GitHub as well, since we can use `pip install git+` with the version from the release tag. There are many resources available; some of the ones I found are:
  a. https://www.freecodecamp.org/news/how-to-use-github-as-a-pypi-server-1c3b0d07db2/
  b. https://medium.com/network-letters/using-github-as-a-private-python-package-index-server-798a6e1cfdef
  c. An example of a PyPI index hosted as a GitHub repository: https://github.com/astariul/github-hosted-pypi which works with GitHub Releases. The total size of cumulative GitHub Releases has no limit, but individual releases have to be below 2 GB each, which is far above our release size.
- The GitLab package registry supports PyPI packages. We could host a read-only mirror of PyBaMM on GitLab that gets updated with every commit to the `develop` and `main` branches. The better way would be to create a package registry there and write a GitLab CI pipeline to download artifacts from a GitHub Actions pipeline that uploads them, thereby starting a chain of CI/CD pipelines. Some resources (GitHub Actions in the marketplace, StackOverflow answers, and blogs) that might be useful in this case are listed below.
  a. https://github.com/marketplace/actions/trigger-gitlab-ci
  b. https://stackoverflow.com/questions/63308904/push-to-gitlab-with-access-token-using-github-actions
  c. https://github.com/marketplace/actions/trigger-gitlab-ci-through-webhooks
  d. https://github.com/marketplace/actions/trigger-gitlab-ci-job
  e. https://github.com/marketplace/actions/gitlab-pipeline-trigger
  f. https://dev.to/edersonbrilhante/gitlab-runners-as-a-service-with-github-action-149n
  g. https://www.anapaulagomes.me/2021/04/publishing-your-python-package-in-your-gitlab-package-registry/
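As a rough illustration of the self-hosted idea in the list above (a PEP 503 index with a 30-day retention window), here is a minimal Python sketch. It only uses the standard library. The wheel filenames, upload dates, and the `https://example.invalid/wheels` base URL are made up for the example; a real workflow would read them from wherever the wheels are actually stored.

```python
import re
from datetime import datetime, timedelta, timezone
from html import escape

RETENTION = timedelta(days=30)  # retention window suggested in the discussion


def normalize(name: str) -> str:
    """Normalise a project name the way PEP 503 specifies."""
    return re.sub(r"[-_.]+", "-", name).lower()


def prune(wheels: dict[str, datetime], now: datetime) -> dict[str, datetime]:
    """Keep only the wheels uploaded within the retention window."""
    return {name: ts for name, ts in wheels.items() if now - ts <= RETENTION}


def render_project_page(project: str, base_url: str, filenames: list[str]) -> str:
    """Render a simple-index project page: one anchor per distribution file."""
    lines = []
    for filename in sorted(filenames):
        url = escape(f"{base_url}/{filename}")
        lines.append(f'    <a href="{url}">{escape(filename)}</a><br/>')
    links = "\n".join(lines)
    return (
        "<!DOCTYPE html>\n<html>\n  <body>\n"
        f"    <h1>Links for {escape(normalize(project))}</h1>\n"
        f"{links}\n  </body>\n</html>\n"
    )


# Hypothetical data: filenames and dates are placeholders, not real releases.
now = datetime(2024, 3, 1, tzinfo=timezone.utc)
wheels = {
    "pybamm-24.1.dev1-py3-none-any.whl": datetime(2024, 1, 15, tzinfo=timezone.utc),
    "pybamm-24.1.dev2-py3-none-any.whl": datetime(2024, 2, 20, tzinfo=timezone.utc),
}
kept = prune(wheels, now)
page = render_project_page("PyBaMM", "https://example.invalid/wheels", list(kept))
print(page)
```

The page would be served at `/simple/<normalized-name>/` on the index, so the same `normalize` helper can be reused to build the directory name.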
We need to make sure that the guide to downloading and using nightly releases is documented properly and that it warns unsuspecting users against using them. An edge case to take care of is that `pip` does not treat `--extra-index-url` as a fallback that is consulted only when a package is missing from the main index: it looks at all configured indexes together and has no notion of index priority, which is a common modus operandi for dependency hijacking attacks. Source: https://discuss.python.org/t/advice-to-avoid-extra-index-url-to-install-private-packages-from-gitlab-ci/18242/11
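To make the risk concrete, here is a small Python sketch of the failure mode. It is a deliberately simplified model, not pip's actual resolver: it assumes the installer pools candidates from every configured index and takes the highest version, with no preference for where a candidate came from. The index names and version numbers are invented for the example.

```python
def parse(version: str) -> tuple[int, ...]:
    """Parse a purely numeric dotted version such as '23.9' (simplification)."""
    return tuple(int(part) for part in version.split("."))


def pick(candidates: list[tuple[str, str]]) -> tuple[str, str]:
    """Choose the highest version, ignoring which index offered it."""
    return max(candidates, key=lambda candidate: parse(candidate[1]))


# Hypothetical candidates for one package name: the legitimate build lives on
# the nightly index, and someone has published a higher version elsewhere.
candidates = [
    ("nightly-index", "23.9"),
    ("pypi", "99.0"),
]
chosen = pick(candidates)
print(chosen)
```

Under this model the candidate from the other index is selected simply because its version number is higher, which is why the installation guide should be explicit about which index each package is expected to come from.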
I think either Anaconda or GitHub Releases would be the best method overall. Both of them can be integrated with the existing release infrastructure very well.
We are below the 5 GB limit (see https://pypi.org/project/pybamm/#files), so we can try the Anaconda free plan.
Hello! I work at Cloudsmith. :-) Small correction is that it's $89 flat pcm, not per user, at Cloudsmith.
Happy to help with questions!
Hi there @lskillen, does Cloudsmith's artifact management solution offer a free plan for technical open-source scientific projects like PyBaMM? I ask because I found this resource in the Cloudsmith guides: https://help.cloudsmith.io/docs/open-source-hosting-policy. It would be great if we qualify as an apposite project!
If it's open-source, I don't see why not? 😁 Generally we don't require pre-approval for OSS projects: just sign up, create an OSS repository, add a license, and away you go. All we require is an attribution link to say we're providing it for you. Approval is only needed if you start to use significantly more bandwidth (as mentioned in the doc).