pygradle
Definition of an explicit installation sequence of dependencies?
Hi all,
this question might be more related to Gradle than to pygradle, so please redirect me if necessary. Anyhow, I think it's worth discussing in the context of pygradle here.
Currently, I am trying to build a minimal working example in the setting of Machine Learning (say, a getting started project on the Iris dataset) using:
- Docker as a build container
- pygradle to build my python project
- pivy-importer to have a locally cached pypi repository (I am considering moving that to Artifactory in the future, but since we don't own a Pro instance here, I am stuck with the open source version)
It is my aim to make a PR for another example project once everything runs smoothly. Currently, I am facing one last problem...
Among others, I am defining python project dependencies for:
- scipy (0.18.1) and
- scikit-learn (0.18).
scikit-learn depends on scipy, but does not declare this dependency explicitly in any metadata. As a result, if one tries to install scikit-learn (even in a fresh environment), the installation fails because the required scipy library is missing (see the attached log1.txt file for this scenario from within pygradle).
From the same log1.txt file, I deduce that dependencies are installed in alphabetical order (I have tested some permutations in the build.gradle file without success). Since "scikit-learn" < "scipy" (at the ASCII level), the build step "installPythonRequirements" will always fail when dependencies are resolved in alphabetical order.
As a workaround, one can
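The ordering claim above can be checked with a small stand-alone Groovy snippet. This is only a sketch of plain string ordering; the variable names are hypothetical, and the snippet does not show how pygradle sorts internally, which this thread does not spell out.

```groovy
// Hypothetical stand-alone snippet; the list mirrors the two dependencies
// named in this thread, in the order one would like them installed.
def declared = ['scipy', 'scikit-learn']

// sort(false) returns a sorted copy and leaves the original list untouched.
// Strings are compared character by character by their character codes.
def installOrder = declared.sort(false)

// The names share the prefix "sci"; the next characters are 'k' and 'p',
// and 'k' precedes 'p', so scikit-learn sorts ahead of scipy.
assert installOrder == ['scikit-learn', 'scipy']
println installOrder
```

If the install step follows such a sorted list, scikit-learn is attempted before scipy regardless of the order written in build.gradle, which matches the behaviour described above.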
- remove the dependency for scikit-learn for the first build run,
- wait for the import error in the source code,
- re-add the dependency for scikit-learn, and
- run the build step again.
Since the second run uses the same virtualenv as the first, scipy is correctly detected. This, in turn, results in a successful installation of scikit-learn, so the build finally succeeds.
Therefore my question (for this scenario, though one can imagine multiple others): how can I define or configure that scikit-learn is installed after scipy?
Thanks! André
Hi @busche, do you know if the scipy or scikit-learn maintainers are aware of this missing metadata (i.e., the dependency declaration)? If that were properly modeled, this wouldn't be an issue. I think @zvezdan might have a workaround for you that we use with other internal dependencies that have this type of bad or missing metadata.
@sholsapp This is an issue because these scientific Python packages are not pure Python packages but rather Python/C/Fortran libraries that unfortunately depend on each other's C shared libraries to build. Standard Python packages don't have this issue and can be installed out-of-order from their dependencies.
We do have the flexibility to change this in pygradle, though, very easily. I'm currently busy with something and want to test it before replying to @busche. I'll have something here this evening (Pacific time).
Hi @zvezdan,
do you have any updates on this issue? Maybe I can lend a hand in testing something?
Best, André
@busche You can put the dependencies in the order you want to enforce in build.gradle:
dependencies {
    // ...
    python 'pypi:scipy:0.18.1'
    // ...
    python 'pypi:scikit-learn:0.18'
    // ...
}
Then add (probably before this section):
project.tasks.findByName('installPythonRequirements').sorted = false
If you want to avoid depending on a specific internal name, you can use this instead:
import com.linkedin.gradle.python.plugin.PythonPlugin
project.tasks.findByName(PythonPlugin.TASK_INSTALL_PYTHON_REQS).sorted = false
That skips sorting the dependencies before the install, so they are installed in the order in which they appear in the direct dependencies closure.
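For readers who want the pieces in one place, here is a minimal build.gradle sketch that combines the snippets above. It assumes the pygradle plugin has already been applied earlier in the script (that part is omitted and not taken from this thread). The null check and the `installTask` variable are additions for illustration: Gradle's `findByName` returns null when no task with that name exists.

```groovy
// build.gradle (sketch; plugin application is assumed to happen elsewhere)
import com.linkedin.gradle.python.plugin.PythonPlugin

// Look up the install task via the plugin constant rather than a literal name.
// 'installTask' is a hypothetical local variable used only in this sketch.
def installTask = project.tasks.findByName(PythonPlugin.TASK_INSTALL_PYTHON_REQS)

// findByName returns null if the task is missing, so guard before setting
// the property that turns off sorting.
if (installTask != null) {
    installTask.sorted = false
}

dependencies {
    // Declared in the order they should be installed: scipy first.
    python 'pypi:scipy:0.18.1'
    python 'pypi:scikit-learn:0.18'
}
```

With the guard in place, a missing task leaves sorting silently enabled; if a hard failure is preferred, `project.tasks.getByName(...)` throws when the task does not exist.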
Short feedback from my side: It works. Within the next few days, I will have a small write-up on this.
I've added the example as PR #87 - I hope this helps the others to get started.