cookiecutter-data-science
cookiecutter-data-science copied to clipboard
Make commands operate in current shell environment (not necessarily in the project Python environment)
I exploring further the tool and I have two comments:
If the default python of the system is version 3.x, and a new project with version 2 is chosen, then the environment testing fails. The reason is that it tries to use the global python
which is version 3.x.
Second issue, after creating the environment (assuming no problem with the python version), then make requirements
uses the global pip
and thus installs everything into the root of conda. I either missed something, or there's some wrong in the flow. It seems not to be possible to activate a conda environment from within the Makefile
. So, basically, the user has to activate the environment before runing make requirements
.
What do you think?
In this fork I managed to tackle the problem.
It is not 100% related, but note, I am assuming availability of conda
.
Yep, you've picked out what is consistently a sticky issue with environments generally. It's not possible to activate a virtual environment within the Makefile and then will have that environment available to the user at the command line. We currently take the approach of alerting the user with a message that they need to manually run a command.
Feel free to reopen if you've got suggestions that are clean and will work consistently across platforms.
I believe my suggestion as I linked above is by far superior in comparison to a warning. With the current implementation the following two steps:
make create_environment
make requirements
would install in the requirements in the root of conda. This is very bad from the user's perspective. Basically, in the Makefile, whenever there's a call depending on the environment (this applies also to awscli
) the environment should be activated and the line should not be broken (as each line is a subshell inside the Makefile).
@pjbull I would be happy to understand what is the downside/wrong with my approach in your mind.
No real downside except that for nearly all use cases a user will need to activate the environment as part of their workflow anyway to do development. I've also got the minor aesthetic quibble that it's repeated all over and as we maintain the project, we'll have to keep track of which commands need the environment active and which ones don't. From that perspective, I almost like the idea of a dependency that checks for the activated environment and bails if it's not active instead with a warning.
Other thing we'll want to make sure is that it works for both conda
and virtualenv
and works whether or not the shell you run make
from already has the environment activated or not.
Thoughts?
The bare minimum is to exit the make process if the virtual environment has to be activated in the shell but it is not. Otherwise, the issue I mentioned with modifying the root can easily happen. This, by the way, in my mind, is much more dangerous.
Otherwise, enclosing each statement in the Makefile
with virtual environment activation (supporting both conda
and virtualenv
) is definitely a pain. Maybe a utility script can help, but first I don't have an idea how to implement such a script and secondly, I imagine this can cause issues in term of being platform agnostic.
At least for the time being, I will try to stick with enforcing conda
and activating the environment in the Makefile
when ever needed. The second part should not be too hard.
Agreed people accidentally overwriting their base site-packages is annoying (maybe not dangerous). Reopening this to see if someone has time to put together a PR that has a script (or preferably just a make command) to check if the environment is active.
I added in the Makefile
a wrapper for calling python related calls from within the environment:
define execute_in_env
source activate $(REPO_NAME); \
$1
endef
In turn, here's a sample target:
test_env_run:
$(call execute_in_env, python --version)
I didn't try yet to figure out how to check whether conda-env
or virtualenv
(or maybe no environment at all?) is used.
For reference: I reused code from here
@drorata original link doesn't work. I think he meant this: https://github.com/drorata/ds-cookiecutter/blob/master/%7B%7B%20cookiecutter.short_prj_name%20%7D%7D/Makefile
Thanks for providing this cookiecutter template!
Should requirements get installed into the correct conda environment when doing this?
make create_environment
conda activate my-env
make requirements
It seems like I am still getting global installs like that.
Oh, it seems like I had pyenv
activated in addition to conda
and that messed it up. Works after getting rid of pyenv
. 👍