cog
cog copied to clipboard
cog: add support for "environment" to set environment variables in the generated Dockerfile
ℹ️ Docs preview: https://github.com/hangtwenty/cog/blob/cog-environment-variables/docs/yaml.md#environment
Change 1/2: support for environment (ENV)
Closes issue #291, "Environment variables in cog.yaml." In Dockerfile parlance, it's support for ENV. This supersedes my the previous PR, #361.
This enables adding environment to build in cog.yaml:
build:
environment:
- EXAMPLE=/src/example
- DEBUG=1
And those become ENV directives in the Dockerfile.
Change 2/2: A single default environment variable to configure caching for PyTorch and some other libraries (XDG_CACHE_HOME)
One of the libraries used most often with cog is PyTorch. You don't have to do configure anything for it, and now the caching should "just work" in most cases. Likewise for other popular libraries.
Variables you DON'T need to set
You don't need to set any of the following because the sane default for $XDG_CACHE_HOME will take care of it. (Because it's part of a standard, several libraries use it for their defaults.)
- pytorch: You don't need to set
PYTORCH_TRANSFORMERS_CACHE— docs - Hugging Face: You don't need to set
TRANSFORMERS_CACHE— docs- You shouldn't have to set
PYTORCH_PRETRAINED_BERT_CACHE— docs
- You shouldn't have to set
- pip: no variables needed! Before this PR,
cogalready configured the pip/apt-get cache directories
You don't need to set TORCH_HOME
If TORCH_HOME is set, that'll take precedence over XDG_CACHE_HOME — see pytorch docs, huggingface docs. So, if you set TORCH_HOME to something else, just be sure to set it within /src/, like /src/.cache-torch. That way, it will get cached across runs.
Updated the PR description, so that it can serve as the interim docs/FAQ about the feature.
Nice! It looks like this is ready to go implementation-wise and just needs some documentation.
@hangtwenty is that something you can work on?
@zeke Sure, I'll make some time this week.
Hmm, it's getting an ❌ for the DCO check, but I just did git commit --amend -s and re-pushed, and it did not resolve it. What do I do about that?
DCO check
I approved it manually. 🍏
Great work, @hangtwenty!
I'll defer to @bfirsh and @andreasjansson to make a call on whether this is ready to ship.
@hangtwenty Thanks for all your help getting this together. I think what is making this hard to review is there are some unrelated changes in this PR that need extra review -- the extra cache environment variables, all the extra docs about how to do caching, etc.
Perhaps we could pull out just the change that adds environment to cog.yaml and have some really minimal documentation for just that thing, then we could have separate PRs for the other changes.
No worries if you haven't got time -- we want to add this too so maybe we'll have the chance to slim this down.
@bfirsh I see. I'm kinda busy at the moment so I may not have time to separate to two PRs soon.
I initially considered doing it them separately, but as I thought about it more, I thought end-users would have a better experience if these two changes were released at the same time. Sure, they are two separate concerns from the implementation perspective, but I thought that, from an end-user perspective, the changes serve to answer, "How can I use/reuse a cache directory [with PyTorch/etc.]?"
The goal was to save end-users from some unnecessary re-work. As an end-user I prefer "batteries included but removable" rather than "batteries purchased separately," i.e. light amount of convention/sane-defaults shipped together with ability to override configuration. In this case the "sane default" is just one default environment variable, $XDG_CACHE_HOME, defaulting to a value that makes sense in the context of cog (i.e. within /src/ so by default PyTorch will reuse a cache folder).
Mere suggestions, though.
No worries if you haven't got time -- we want to add this too so maybe we'll have the chance to slim this down.
Feel free, of course! Just confirmed "allow edits by maintainers" is enabled.
@hangtwenty sorry this got stuck. We'd still love to include this if you ever find the time again for a slimmed down PR. Hope you're well. :)
