datahub
audit datahub `environment.yml` files for correctness
Bug description
this pertains only to the conda envs, not pip.
while looking at `datahub/deployments/datahub/images/default/environment.yml`, i discovered a wide range of ways in which we define package versions. all of this variation can create confusion and lead to unexpected behavior. `package=version` is different from `package==version`: a single `=` is a fuzzy match that effectively puts a wildcard after the version, while `==` matches that version exactly.
https://conda.io/projects/conda/en/latest/user-guide/concepts/pkg-specs.html#package-match-specifications
for all package requests that have a specific version supplied, i would strongly recommend we move to `==` to pin exactly to that version. for those packages with wildcards in their versioning, leave the `=` operator as is.
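to make the difference concrete, here is a rough sketch of the two matching rules in python. this is a simplified model based on the examples in the conda docs linked above, not conda's actual implementation; the function names are made up for illustration, and it assumes plain dotted versions with no wildcards or build strings.

```python
def _strip_trailing_zeros(version):
    """Split a dotted version and drop trailing '0' parts, so 1.11 and 1.11.0 compare equal."""
    parts = version.split(".")
    while len(parts) > 1 and parts[-1] == "0":
        parts.pop()
    return parts


def fuzzy_matches(requested, candidate):
    """Model of `package=requested`: behaves roughly like `requested.*`."""
    return candidate == requested or candidate.startswith(requested + ".")


def exact_matches(requested, candidate):
    """Model of `package==requested`: only that version (ignoring trailing zeros)."""
    return _strip_trailing_zeros(requested) == _strip_trailing_zeros(candidate)


if __name__ == "__main__":
    for candidate in ["1.11", "1.11.0", "1.11.2", "1.12"]:
        print(
            candidate,
            "fuzzy:", fuzzy_matches("1.11", candidate),
            "exact:", exact_matches("1.11", candidate),
        )
```

the point is that `numpy=1.11` can silently resolve to any later `1.11.x` release, while `numpy==1.11` cannot.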
we should check these out, and clean up/fix if necessary:
./deployments/stat20/image/environment.yml
./deployments/datahub/images/default/environment.yml
./deployments/stat159/image/environment.yml
./deployments/biology/image/environment.yml
./deployments/data8/image/environment.yml
./deployments/cee/image/environment.yml
./deployments/a11y/image/environment.yml
./deployments/data100/image/environment.yml
./deployments/eecs/image/environment.yml
./deployments/data8xv2/image/environment.yml
./deployments/data8x/image/environment.yml
./deployments/astro/image/environment.yml
./deployments/julia/image/environment.yml
Environment & setup
all hubs
How to reproduce
here be dragons
I totally agree. The versions fuzzified by `=` will have to be reconciled with what actually got installed in those images and then converted to `==`.
The downside may be that fuzziness is actually necessary to resolve some dependencies, and we have to see how it all turns out in the built image.
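One possible way to do that reconciliation is sketched below. It assumes the installed versions are captured by running `conda list --export` inside the built image, which prints lines of the form `name=version=build`. The helper names `parse_conda_list_export` and `pin_exactly` are hypothetical, and specs with channel prefixes, wildcards, ranges, or build strings are simply left untouched.

```python
import re

# matches only `name=version` with a single `=` and no wildcard
_FUZZY = re.compile(r"^([^\s=<>!~]+)=([^=*\s]+)$")


def parse_conda_list_export(text):
    """Map package name -> installed version from `conda list --export` output."""
    installed = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # header comments and blank lines
        parts = line.split("=")
        if len(parts) >= 2:
            installed[parts[0].lower()] = parts[1]
    return installed


def pin_exactly(spec, installed):
    """Turn a fuzzy `name=version` spec into `name==<installed version>`.

    Specs that are not fuzzy, or whose package is not in `installed`,
    are returned unchanged. A ValueError is raised if the installed
    version does not fall under the requested fuzzy version.
    """
    match = _FUZZY.match(spec.strip())
    if match is None:
        return spec
    name, requested = match.groups()
    actual = installed.get(name.lower())
    if actual is None:
        return spec
    if actual != requested and not actual.startswith(requested + "."):
        raise ValueError(f"{name}: requested {requested} but image has {actual}")
    return f"{name}=={actual}"
```

For example, with `pandas=1.3.5=py39h...` in the export, `pin_exactly("pandas=1.3", installed)` would give `pandas==1.3.5`. Whether the solver can still resolve the environment once everything is pinned this way would need to be checked by rebuilding the image.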