Separate installation of dependencies?

Open wbecker opened this issue 1 year ago • 7 comments

I am building a Dockerfile, using Python, with hatch as a build manager. I want to install dependencies separately from compiling py => pyc files, so that when my code changes I can cache the dependencies (which can take a while to redownload and install).

I'm not sure how I can build the dependencies in hatch as a separate step. (Maybe I can't read the hatch docs, but it doesn't seem to mention it anywhere)

My Dockerfile looks something like this (simplified):

FROM python
RUN python -m pip install hatch
COPY . /
RUN hatch build

I'd like to change it to

FROM python
RUN python -m pip install hatch
COPY pyproject.toml / 
RUN hatch build-deps    # this step should be cached if a different file is changed
COPY . /
RUN hatch build         # this step shouldn't re-install deps again

Is this possible - if so, maybe this page should be updated to mention it - https://hatch.pypa.io/latest/config/dependency/ ?

wbecker avatar Sep 11 '23 14:09 wbecker

Under the hood, Hatch currently uses pip for dependency management, so to persist the cache you would use pip's own mechanisms: https://pip.pypa.io/en/stable/topics/caching/#where-is-the-cache-stored

ofek avatar Sep 11 '23 16:09 ofek

Thanks - for future people - I found these 3 issues on pip:

  • https://github.com/pypa/pip/issues/11440
  • https://github.com/pypa/pip/issues/8049
  • https://github.com/pypa/pip/issues/11584

The solution given (for now until 11440 happens) is to read the requirements out of the toml file and pipe them into pip. It's a bit ugly but it seems to work...

FROM python
RUN python -m pip install hatch
COPY pyproject.toml / 

RUN pip install toml && python -c 'import toml; c = toml.load("pyproject.toml"); print("\n".join(c["build-system"]["requires"]))' | pip install -r /dev/stdin

# toml is already installed by the previous step, so it is not reinstalled here
RUN python -c 'import toml; c = toml.load("pyproject.toml"); print("\n".join(c["project"]["dependencies"]))' | pip install -r /dev/stdin

COPY . /
RUN hatch build         # this step shouldn't re-install deps again

Hatch then uses these already downloaded files and doesn't grab them again, so if you modify only your source it should be faster.

wbecker avatar Sep 12 '23 10:09 wbecker

I let Hatch build the environment fully, then use pip uninstall to remove the application/library. This leaves the environment fully populated with all dependencies not only cached but already installed, which can save a lot of time if the installation of any of them includes compiling native code.

There was some discussion of this idea here: https://github.com/pypa/hatch/discussions/376

kpfleming avatar Oct 24 '23 10:10 kpfleming

To install the dependencies of an environment, you can add a script with an empty body to hatch.toml:

[envs.some-environment.scripts]
update-deps = ""

Then use it in Dockerfile like so:

# the next step can be cached
RUN hatch run update-deps

Ae-Mc avatar Mar 14 '24 14:03 Ae-Mc

Won't that still install the project code? The idea of this issue is to have a way to install the dependencies without installing the project.

kpfleming avatar Mar 15 '24 11:03 kpfleming

Maybe, I'm not sure. Sorry if I was wrong

Ae-Mc avatar Mar 15 '24 21:03 Ae-Mc

The following seems to work fine for caching dependencies inside a Dockerfile:

RUN hatch dep show requirements | xargs pip install
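One caveat with the `xargs` pipeline: `xargs` splits its input on whitespace and interprets quotes, so a requirement that carries an environment marker (for example `importlib-metadata; python_version < "3.8"`) would probably be broken into several arguments. The sketch below passes each line to pip as a single argument instead. It assumes that `hatch dep show requirements` prints one requirement per line; the helper names are hypothetical.

```python
import subprocess
import sys


def build_pip_command(requirement_lines: list[str]) -> list[str]:
    """Build one pip invocation with each non-empty line as a single argument."""
    requirements = [line.strip() for line in requirement_lines if line.strip()]
    return [sys.executable, "-m", "pip", "install", *requirements]


def install_hatch_requirements() -> None:
    # Assumption: the command prints one requirement per line on stdout.
    result = subprocess.run(
        ["hatch", "dep", "show", "requirements"],
        check=True,
        capture_output=True,
        text=True,
    )
    command = build_pip_command(result.stdout.splitlines())
    if len(command) > 4:  # skip pip when there is nothing to install
        subprocess.run(command, check=True)
```

Calling `install_hatch_requirements()` from a script run in the cached Docker step would then stand in for the `xargs` form.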

cdleonard avatar Aug 02 '24 12:08 cdleonard