cog icon indicating copy to clipboard operation
cog copied to clipboard

cog: add support for "environment" to set environment variables in the generated Dockerfile

Open floer32 opened this issue 3 years ago • 9 comments
trafficstars

ℹ️ Docs preview: https://github.com/hangtwenty/cog/blob/cog-environment-variables/docs/yaml.md#environment


Change 1/2: support for environment (ENV)

Closes issue #291, "Environment variables in cog.yaml." In Dockerfile parlance, it's support for ENV. This supersedes my the previous PR, #361.

This enables adding environment to build in cog.yaml:

build:
  environment:
    - EXAMPLE=/src/example
    - DEBUG=1

And those become ENV directives in the Dockerfile.

Change 2/2: A single default environment variable to configure caching for PyTorch and some other libraries (XDG_CACHE_HOME)

One of the libraries used most often with cog is PyTorch. You don't have to do configure anything for it, and now the caching should "just work" in most cases. Likewise for other popular libraries.

Variables you DON'T need to set

You don't need to set any of the following because the sane default for $XDG_CACHE_HOME will take care of it. (Because it's part of a standard, several libraries use it for their defaults.)

  • pytorch: You don't need to set PYTORCH_TRANSFORMERS_CACHEdocs
  • Hugging Face: You don't need to set TRANSFORMERS_CACHEdocs
    • You shouldn't have to set PYTORCH_PRETRAINED_BERT_CACHEdocs
  • pip: no variables needed! Before this PR, cog already configured the pip/apt-get cache directories

You don't need to set TORCH_HOME

If TORCH_HOME is set, that'll take precedence over XDG_CACHE_HOME — see pytorch docs, huggingface docs. So, if you set TORCH_HOME to something else, just be sure to set it within /src/, like /src/.cache-torch. That way, it will get cached across runs.

floer32 avatar Jan 26 '22 05:01 floer32

Updated the PR description, so that it can serve as the interim docs/FAQ about the feature.

floer32 avatar Jan 27 '22 21:01 floer32

Nice! It looks like this is ready to go implementation-wise and just needs some documentation.

@hangtwenty is that something you can work on?

zeke avatar Mar 14 '22 20:03 zeke

@zeke Sure, I'll make some time this week.

floer32 avatar Mar 15 '22 01:03 floer32

Hmm, it's getting an ❌ for the DCO check, but I just did git commit --amend -s and re-pushed, and it did not resolve it. What do I do about that?

floer32 avatar Mar 16 '22 20:03 floer32

DCO check

I approved it manually. 🍏

zeke avatar Mar 16 '22 21:03 zeke

Great work, @hangtwenty!

I'll defer to @bfirsh and @andreasjansson to make a call on whether this is ready to ship.

zeke avatar Mar 16 '22 21:03 zeke

@hangtwenty Thanks for all your help getting this together. I think what is making this hard to review is there are some unrelated changes in this PR that need extra review -- the extra cache environment variables, all the extra docs about how to do caching, etc.

Perhaps we could pull out just the change that adds environment to cog.yaml and have some really minimal documentation for just that thing, then we could have separate PRs for the other changes.

No worries if you haven't got time -- we want to add this too so maybe we'll have the chance to slim this down.

bfirsh avatar Apr 19 '22 13:04 bfirsh

@bfirsh I see. I'm kinda busy at the moment so I may not have time to separate to two PRs soon.

I initially considered doing it them separately, but as I thought about it more, I thought end-users would have a better experience if these two changes were released at the same time. Sure, they are two separate concerns from the implementation perspective, but I thought that, from an end-user perspective, the changes serve to answer, "How can I use/reuse a cache directory [with PyTorch/etc.]?"

The goal was to save end-users from some unnecessary re-work. As an end-user I prefer "batteries included but removable" rather than "batteries purchased separately," i.e. light amount of convention/sane-defaults shipped together with ability to override configuration. In this case the "sane default" is just one default environment variable, $XDG_CACHE_HOME, defaulting to a value that makes sense in the context of cog (i.e. within /src/ so by default PyTorch will reuse a cache folder).

Mere suggestions, though.

floer32 avatar Apr 21 '22 20:04 floer32

No worries if you haven't got time -- we want to add this too so maybe we'll have the chance to slim this down.

Feel free, of course! Just confirmed "allow edits by maintainers" is enabled.

floer32 avatar Apr 22 '22 19:04 floer32

@hangtwenty sorry this got stuck. We'd still love to include this if you ever find the time again for a slimmed down PR. Hope you're well. :)

bfirsh avatar Aug 26 '22 02:08 bfirsh