META: Hatch documentation upgrade
Following the discussion here, we discussed with @ofek @dahandv upgrading the hatch docs to include more how-to and tutorial style elements to help users get started with hatch.
This is also related to this issue opened by @pfmoore about using the Diátaxis framework.
In this issue we can iterate around what the structure of tutorials vs how-tos should look like and what we wish to create / develop further to help hatch users. I'll attempt to track comments below and update the main outline here as the discussion progresses.
I'll also try to scan issues and discussions here and there to identify pain points and get users involved in the upgraded content reviews :) @dahandv
Note that we also are working on tutorials at pyOpenSci which we could link to / use as needed here. Here is a tutorial on publishing to PyPI using hatch.
I'm starting the discussion here but probably can't work out a full outline now. Please add comments about other tutorials / how to's that you'd like to see and i will update this header comment as needed. (or @ofek obviously you can always edit it too!).
Hatch How To's
- [ ] How to create and work with environments using hatch (@lwasser)
Hatch Tutorials
- [ ] Tutorial on Python version management with hatch - @dahandv
@lwasser hey! Sorry for being off line for so long! I'm working on a reduced guide for diataxis so new contributions can get up to speed when they wish to contribute to any kind of documentation! (Instead of letting them figure this out themselves which can be a turnoff for some; the diataxis official guide is wordy and repetitive IMO!) I will link this here today (I hope!) your review and comments will be appreciated ❤️
looking forward to seeing what you pull together @DahanDv !
It would be good to document advice on usage within Docker (this was requested in the past).
Is just installing `hatch` via curl and running `hatch --version` expected to add over 400MB in disk usage? If python is available, one can install `pipx` to get `hatch` which is less heavy, but `uv` seems to be pulled with these two install methods too regardless if you'd use it? (IIRC in one case it was about 30MB while the other had about 90MB of data related to `uv`).
I had seen in the docs a brief note/admonition about standalone/installers not being able to detect/use an existing python install, thus pulling in a standalone version of python? (I had attempted to avoid this with a `config.toml`, but it didn't seem to help reduce weight)
If 150-400MB is to be expected, it might be worthwhile to raise some awareness there. At least with an endorsed approach for using `hatch` within a container, that expectation of disk weight would be clearer :)
i am not sure if i can help here or not but chiming in. i just played with this quickly locally. when i created a docker container with python / pip in it it automatically increased the container size by about 330mb.
my question: if python is not installed on a user's system and you install hatch, will it by default now try to install python now that it supports uv?
i wonder if this should be another issue where folks chime in but also i wonder if anyone has worked with a docker container with some version of python already installed to see if there is a difference in the size of the container when running `hatch --version` (as a way to potentially tease out the need for python to be installed and how it's set up most efficiently in a container vs. hatch's default behavior).
please excuse this comment if it's totally off base. it does seem like docs around this would be useful!
I will respond to a few comments at the same time:
- If you download the Hatch binary then on the first run it will download a Python distribution and install itself from PyPI. If this is undesirable then manage Python and install Hatch manually.
- UV is only used for virtual environment creation and dependency installation, entirely a runtime thing when using environments that have it enabled so for example `hatch --version` would not invoke UV at all.
- It would be helpful to know where exactly the disk space is coming from. Perhaps the Dockerfile isn't cleaning up pip caches.
@ofek to clarify, the example above referred to this issue comment, which had this docker setup:
$ docker run -it ubuntu:22.04 bash
$ apt update && apt install -y curl
$ curl -sSfL https://github.com/pypa/hatch/releases/download/hatch-v1.10.0/hatch-1.10.0-x86_64-unknown-linux-gnu.tar.gz | tar -xz
$ mv hatch-1.10.0-x86_64-unknown-linux-gnu /usr/local/bin/hatch
$ du -shx /
144M /
$ hatch --version
Hatch, version 1.10.0
$ du -shx /
558M /
in this case a user is
- creating a "blank slate" docker environment with ubuntu only from what I can see.
- downloading hatch via curl.
So nothing is run - yet. Then the hatch binaries are moved into a new location so hatch can be called.
To me it makes sense based on what you wrote above that in this specific case, when you run `hatch --version` it will first download python. And that Python download accounts for the increase in size of the container.
The alternative approach would be for someone to create a docker container that first installs python or inherits from another container on dockerhub that contains python.
Is that interpretation correct? and if it is, would it make sense to create a small how to (or add doc enhancements elsewhere)? i'm happy to help create a very basic example of this that others could enhance / build off of.
here is a repro example. i definitely saw it install python and hatch when i ran `hatch --version`. NOTE: i'm on a mac so using a different release distro below compared to the example referred to above! But a small cleanup step did reduce the size.
$ docker run -it ubuntu:22.04 bash
root@bd1bf5df743c:/# apt update && apt install -y curl
root@bd1bf5df743c:/# mv hatch-1.10.0-aarch64-unknown-linux-gnu /usr/local/bin/hatch
root@bd1bf5df743c:/# du -shx
131M .
root@bd1bf5df743c:/# hatch --version
Hatch, version 1.10.0
root@bd1bf5df743c:/# du -shx
361M .
root@bd1bf5df743c:/# rm -rf /var/lib/apt/lists/*
root@bd1bf5df743c:/# du -shx
316M .
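The before/after `du` comparison used in the sessions above can be wrapped in a small helper. This is only a sketch: `measure_growth` is a hypothetical function name (nothing Hatch or `du` provides), it reports KiB using `du -skx`, and the demo measures a throwaway directory instead of `/` so that it can run outside a container.

```shell
#!/bin/sh
# Sketch: report how much disk usage (in KiB) a command adds to a directory.
# `measure_growth` is a hypothetical helper, not part of hatch or du.
measure_growth() {
    target_dir="$1"
    shift
    # -s: one summary line, -k: KiB units, -x: stay on one filesystem
    before_kib="$(du -skx "$target_dir" | cut -f1)"
    # Run the remaining arguments as the command being measured.
    "$@" >/dev/null 2>&1
    after_kib="$(du -skx "$target_dir" | cut -f1)"
    echo "$((after_kib - before_kib))"
}

# Demo on a throwaway directory: write 2 MiB of random data and measure it.
demo_dir="$(mktemp -d)"
growth_kib="$(measure_growth "$demo_dir" \
    dd if=/dev/urandom of="$demo_dir/payload.bin" bs=1024 count=2048)"
echo "added ${growth_kib} KiB"
rm -rf "$demo_dir"
```

Inside a container the same idea would be `measure_growth / hatch --version`, which should report roughly the jump seen in the sessions above.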
Yes that is actually expected as I mention in my first bullet point. Hatch binaries are built with PyApp and bootstrap themselves on the first run. If you already have Python available and want to cut down on disk space then I would recommend installing manually.
I might be able to shave some MBs off given a new release of the binaries and docs on enabling the option.
fantastic. Ofek would a small "how to" or tutorial about creating a docker environment be useful in the docs? i am not a docker expert but i could at least capture the information here for folks to use.
maybe @polarathene (if you are up for it) could review and provide input as well?
Yes that would be quite helpful! I wouldn't have time to add that new feature until after PyCon though.
I looked into it a bit, here's my findings, hope it's helpful 👍
FWIW, keeping it simple and focused/familiar for most Docker users (that is those less experienced) is probably best. I wouldn't stress too much on size as you can see in the examples below you won't save too much with the added effort, but it's possible 👍
If you write something up and contribute a PR feel free to ping me and I'll try to provide a review if I have the time :)
TL;DR:
- Some distros do package `hatch` already, but they're only providing 1.9 right now. Might take a while before 1.10 is available to benefit from `uv`. These should be the lightest install option when available.
- `pipx install` is fairly simple and easy to do via any distro as an alternative, with the benefit of the latest `hatch` + `uv` (bundled). You'll need to either install `uv` via `pipx` to get it available easily, or alternatively configure `hatch` to provide it per environment, otherwise the symlink (`ln -s` command below) approach works easy enough (most users may be more comfortable with just adjusting `PATH`).
There's also the route of having a `Dockerfile` added to this repo, and optionally a GH Actions workflow that automates publishing images to DockerHub / GHCR with the release CI. Most users would likely be happy using a base image with `hatch`, unless they need to install system packages and have a particular preference (often this is ubuntu or debian for the familiar `apt` command they'll come across online on sites like StackExchange/StackOverflow).
NOTE: `du -shx` reports the total size of the location in MiB (`1024^2`, not MB: `1000^2`, which would be `-sx --si`). So the `M` value in output is MiB.
- `-x` keeps `du` on a single filesystem, skipping anything mounted from another one (unlikely to matter in this case).
- If hardlinks are present (like with `uv`) the content will only be counted once. Thus two separate venv folders with hardlinks to the `uv` package store (cache) would not report duplicates. You can still query an individual venv folder in isolation, but the result does not show that some of the data is shared (hardlinks point to an inode, so unlike a symlink there is no specific location that owns the data).
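The hardlink behaviour described in the note above can be reproduced without `uv` at all. A minimal sketch, assuming a POSIX shell and a temporary directory on a single filesystem; the `store`, `venv_a` and `venv_b` names are made up for illustration and only mimic a shared package cache:

```shell
#!/bin/sh
# Sketch: du counts hardlinked data once per invocation.
set -eu
work_dir="$(mktemp -d)"
mkdir "$work_dir/store" "$work_dir/venv_a" "$work_dir/venv_b"

# One 1 MiB file in a shared "store", hardlinked into two pretend venvs.
dd if=/dev/urandom of="$work_dir/store/pkg.bin" bs=1024 count=1024 2>/dev/null
ln "$work_dir/store/pkg.bin" "$work_dir/venv_a/pkg.bin"
ln "$work_dir/store/pkg.bin" "$work_dir/venv_b/pkg.bin"

# Whole tree: the three names share one inode, so the data is counted once.
total_kib="$(du -sk "$work_dir" | cut -f1)"
# One venv in isolation: the full size shows up, with no hint that it is shared.
venv_a_kib="$(du -sk "$work_dir/venv_a" | cut -f1)"
# Second column of `ls -l` is the number of names pointing at the inode.
link_count="$(ls -l "$work_dir/store/pkg.bin" | awk '{print $2}')"

echo "whole tree:       ${total_kib} KiB"
echo "venv_a only:      ${venv_a_kib} KiB"
echo "links to pkg.bin: ${link_count}"
rm -rf "$work_dir"
```

The whole-tree figure stays close to 1 MiB rather than 3 MiB, which is why summing per-venv sizes overstates the real footprint.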
Install approaches
Package Manager (122 MiB)
$ docker run --rm -it quay.io/fedora/fedora-minimal:41 bash
$ du -shx /
126M /
$ dnf5 install -y --setopt=install_weak_deps=0 hatch
Transaction Summary:
Installing: 75 packages
Upgrading: 5 packages
Replacing: 5 packages
Total size of inbound packages is 33 MiB. Need to download 33 MiB.
After this operation 122 MiB will be used (install 124 MiB, remove 2 MiB).
# Extra is from package manager cache:
$ du -shx /
297M /
# Clean up package manager cache:
$ dnf5 clean all
Removed 12 files, 7 directories. 0 errors occurred.
# Thus total 122 MiB added weight:
$ du -shx /
248M /
Standalone installer (4MiB installs to 400+ MiB)
$ docker run --rm -it quay.io/fedora/fedora-minimal:41 bash
# Fedora image already has curl, just needs tar + gzip to extract:
$ dnf5 install -y tar gzip && dnf5 clean all
# As the tar.gz contains only a single file, we can write the output to the preferred location directly:
$ curl -sSfL https://github.com/pypa/hatch/releases/download/hatch-v1.10.0/hatch-1.10.0-x86_64-unknown-linux-gnu.tar.gz \
| tar -xzO > /usr/local/bin/hatch && chmod +x /usr/local/bin/hatch
# Before triggering install:
$ du -shx /
131M /
# 410+ MiB added weight from install:
$ hatch --version && du -shx /
544M /
Now as a `Dockerfile`, build the image for better insight into the layer for `hatch --version` via the `dive` CLI tool to see where all that weight is coming from:
FROM quay.io/fedora/fedora-minimal:41
RUN dnf5 install -y tar gzip && dnf5 clean all
RUN curl -sSfL https://github.com/pypa/hatch/releases/download/hatch-v1.10.0/hatch-1.10.0-x86_64-unknown-linux-gnu.tar.gz \
| tar -xzO > /usr/local/bin/hatch && chmod +x /usr/local/bin/hatch
RUN hatch --version
# In dir with `Dockerfile` above:
docker build --tag local/hatch .
docker run --rm -it -v /var/run/docker.sock:/var/run/docker.sock wagoodman/dive local/hatch
# Overview of the biggest sources of that weight:
61MB => /root/.cache/pyapp/distributions/14656550572188801628
32MB => /root/.cache/pyapp/uv
229MB => /root/.cache/pyapp/distributions/_14656550572188801628/python/lib/
- 189 MB => libpython3.12.so.1.0
- 26 MB => python3.12
91MB => /root/.cache/uv
Considering that's all in the `/root/.cache` dir and nowhere else, it's not obvious what is safe to remove without breaking any assumptions from `hatch`?
- Presumably `uv` is still optional and could be removed if not needed. Not sure why there are two instances there?
  - `/root/.cache/uv` (91MB) looks like it's a python package for `uv`?
  - `/root/.cache/pyapp/uv` (32MB) is the actual `uv` binary.
- ~~Presumably `/root/.cache/pyapp/distributions/_14656550572188801628/python` can be removed~~
  - EDIT: No, `hatch` breaks, as `/root/.local/share/pyapp/hatch/14656550572188801628/1.10.0/bin/hatch` is reliant upon it.
- If an existing python environment is present (when running `hatch --version` as above), it is disregarded and the `pyapp` python build is still pulled in, this `hatch` is fully self-contained... Thus likely the same for the `uv` dependency?
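If `dive` is not at hand, a rough first look at where the weight sits can be had with plain `du`. A minimal sketch: `largest_dirs` is a hypothetical helper name, sizes are in KiB, and the demo builds a fake cache layout with made-up sizes instead of reading a real `/root/.cache`.

```shell
#!/bin/sh
# Sketch: rank the top-level entries of a directory by disk usage (KiB).
# `largest_dirs` is a hypothetical helper; hidden entries are skipped by the glob.
largest_dirs() {
    root_dir="$1"
    limit="${2:-10}"
    # -s: one line per entry, -k: KiB, -x: stay on one filesystem.
    du -skx "$root_dir"/* 2>/dev/null | sort -rn | head -n "$limit"
}

# Demo with a fake cache layout (sizes are arbitrary, not the real ones).
demo_dir="$(mktemp -d)"
mkdir "$demo_dir/uv" "$demo_dir/pyapp"
dd if=/dev/urandom of="$demo_dir/uv/blob" bs=1024 count=256 2>/dev/null
dd if=/dev/urandom of="$demo_dir/pyapp/blob" bs=1024 count=1024 2>/dev/null
report="$(largest_dirs "$demo_dir")"
echo "$report"
rm -rf "$demo_dir"
```

In the container discussed above this would be `largest_dirs /root/.cache`, followed by a second call on whichever entry turns out to be the heaviest.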
`pipx install hatch` (165 MiB, 48 MiB for `pipx`, 117 MiB for `hatch` + bundled `uv`)
$ docker run --rm -it quay.io/fedora/fedora-minimal:41 bash
# Install pipx with python3:
$ dnf5 install -y pipx && dnf5 clean all
Transaction Summary:
Installing: 17 packages
Total size of inbound packages is 13 MiB. Need to download 13 MiB.
After this operation 48 MiB will be used (install 48 MiB, remove 0 B).
# Pre-install weight (ignoring pipx 48 MiB):
$ du -shx /
184M /
$ pipx install hatch uv
$ export PATH="${PATH}:/root/.local/bin"
# Post-install weight:
$ du -shx /
365M /
$ hatch --version
Hatch, version 1.10.0
$ uv --version
uv 0.1.42
# No change, huzzah!
$ du -shx /
365M /
- 35 MiB of that is from `/root/.cache/`, specifically `pip`. The entire cache folder at this point can be emptied. `pipx install` brings in its own copy of `pip`, thus there is no value in adding `pip` via the package manager.
- 31 MiB belongs to an internal copy of `uv` that `hatch` has bundled in its virtual environment at `/root/.local/share/pipx/venvs/hatch/bin/uv`. This allows `hatch` to use `uv` when configured (eg: `installer = "uv"` in `hatch.toml`), but is otherwise not available to you, even within `hatch run` / `hatch shell` (so `uv pip list` isn't available to inspect what `uv` has installed implicitly via `hatch`)
You could of course make `uv` available a few ways:
- Add a symlink: `ln -s /root/.local/share/pipx/venvs/hatch/bin/uv /usr/local/bin/uv`
- Update your `PATH` ENV: `export PATH="${PATH}:/root/.local/share/pipx/venvs/hatch/bin"` (_**NOTE:** within a `hatch` environment the location of the binary is available via the `HATCH_UV` ENV_)
- Configure `pip` or `uv` to alias the `HATCH_UV` environment variable via "Extra Scripts" (as shown in the docs). But this would need to be per environment AFAIK?
- Install `uv` again via `pipx` if you don't mind the extra space, since `pipx` does not de-duplicate via hardlinks from what I can tell.
FROM quay.io/fedora/fedora-minimal:41
RUN dnf5 install -y pipx && dnf5 clean all
# Hatch bundles uv:
RUN pipx install hatch && rm -rf /root/.cache/
# Effectively what `pipx ensurepath` accomplishes to make the hatch command available:
ENV PATH="${PATH}:/root/.local/bin"
# One of many ways to use the internal uv installed with hatch:
RUN ln -s /root/.local/share/pipx/venvs/hatch/bin/uv /usr/local/bin/uv
# Verify both commands work:
RUN hatch --version && uv --version
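The symlink step in the `Dockerfile` above exposes a single command without putting the whole venv `bin` directory on `PATH`. A minimal sketch of that mechanism using throwaway directories and a fake `uv` script, so it runs without `hatch` or `pipx` installed; the directory names only stand in for the real locations mentioned in the comments:

```shell
#!/bin/sh
# Sketch of the symlink approach using throwaway directories and a fake `uv`.
set -eu
sandbox="$(mktemp -d)"
venv_bin="$sandbox/venvs/hatch/bin"   # stands in for /root/.local/share/pipx/venvs/hatch/bin
public_bin="$sandbox/bin"             # stands in for /usr/local/bin
mkdir -p "$venv_bin" "$public_bin"

# Fake bundled tool so the example runs without hatch installed.
printf '#!/bin/sh\necho "fake-uv 0.0.0"\n' > "$venv_bin/uv"
chmod +x "$venv_bin/uv"

# Expose only `uv`, not everything else that lives in the venv's bin dir.
ln -s "$venv_bin/uv" "$public_bin/uv"
PATH="$public_bin:$PATH"
export PATH

resolved_uv="$(command -v uv)"
uv_output="$(uv --version)"
echo "uv resolves to: $resolved_uv"
echo "$uv_output"
rm -rf "$sandbox"
```

Adding the venv's `bin` directory to `PATH` instead would also work, at the cost of exposing the venv's other executables.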
Advanced: `FROM scratch` multi-stage (roughly 210 MiB total image size)
- While I haven't tried this, the standalone binary installer possibly can be run without a base image, but it's still the heavy-weight choice.
- The Fedora image, thanks to `dnf`, has a feature that can create a new base image with only the minimal packages you need, which we've established is `hatch` (197 MiB total base image) or `pipx` (109 MiB total base image + 117 MiB after `pipx install hatch`). `zypper` (openSUSE) also has this feature, where the `python311-hatch` base will be 179 MiB and `python311-pipx` 87 MiB (+117 MiB after `pipx install hatch`). Any extra commands can still be run in that new root location via `chroot` if needed, such as running `pipx install hatch`, then you can switch to the next stage with `scratch` and `COPY` that over for a minimal image size.
# syntax=docker.io/docker/dockerfile:1
FROM quay.io/fedora/fedora-minimal:41 AS base-stage
# The <<EOF (start) and later EOF (end) markers are HereDoc syntax
# Allows for a RUN directive to more nicely run multiple commands in a single layer
RUN <<EOF
dnf5 --installroot /rootfs --use-host-config --setopt=install_weak_deps=0 install -y pipx
dnf5 --installroot /rootfs --use-host-config --setopt=install_weak_deps=0 clean all
# This works since bash was implicitly installed into the new root fs
# NOTE: DNF was not included, so it is not available once we switch via chroot.
# For DNS lookups like `pipx install` needs, we'll also need to provide `/etc/resolv.conf`
cp /etc/resolv.conf /rootfs/etc/resolv.conf
# chroot is a bit awkward in a Dockerfile, using SHELL directive or after the COPY on scratch
# may be more convenient?
chroot /rootfs bash -c 'pipx install hatch && rm -rf /root/.cache/'
chroot /rootfs ln -s /root/.local/share/pipx/venvs/hatch/bin/uv /usr/local/bin/uv
EOF
FROM scratch
ENV PATH="${PATH}:/root/.local/bin"
COPY --link --from=base-stage /rootfs /
RUN hatch --version && uv --version
Throughout my examples I've used `quay.io/fedora/fedora-minimal:41`, this is a beta image where `dnf5` is built-in. Previously on minimal images it'd be `microdnf`, but once Fedora 41 is released both the minimal image and regular fedora (eg: `fedora:41`) will have dnf5 as the usual `dnf` command (finally!). `fedora-minimal` has a smaller base, but it does make some compromises (for example try running `btop`, it needs a little extra nudge on your part), I think the UX (at least interactively?) goes down a bit, so I'd generally suggest the regular `fedora` images, and it should make little difference with this `--installroot` approach.
Like Fedora, the openSUSE TumbleWeed image is still on hatch `1.9.x`, thus both `hatch` packages are 30 MiB shy of what they'd actually be with `uv` involved. When that lands you'll get a more minimal/simpler `scratch`, but honestly the size isn't that big of a win here:
# syntax=docker.io/docker/dockerfile:1
FROM opensuse/tumbleweed AS base-stage
RUN <<EOF
zypper --releasever tumbleweed --installroot /rootfs --gpg-auto-import-keys refresh
zypper --releasever tumbleweed --installroot /rootfs --non-interactive install --download-in-advance --no-recommends python311-pipx
# Cleanup doesn't make a difference in this case (zypper keeps most cache on the main root), but this is how you'd do it:
# NOTE: If you care about this base-stage image layers you could clear the main root cache without the `--releasever --installroot` args
# zypper --releasever tumbleweed --installroot /rootfs-h --non-interactive clean --all
# No need to worry about the /etc/resolv.conf if you're not doing any network stuff via chroot
# At runtime of the container Docker will replace it to manage networking itself.
EOF
FROM scratch
COPY --link --from=base-stage /rootfs /
RUN hatch --version
NOTE: If you try to do the `pipx` install with the opensuse image you'll find that it fails with the `rm` and `ln` commands not existing. Those are packages that weren't needed for `pipx`, but are required to do those extra steps so you'd need to add them. Fedora on the other hand still installs those basic utility commands.
Alpine (roughly 180 MiB total image size)
Smallest by about 30-40 MiB, fairly simple but Alpine with `musl` does have some caveats to be mindful of.
# syntax=docker.io/docker/dockerfile:1
FROM alpine
RUN <<EOF
apk add --no-cache pipx
pipx install hatch && rm -rf /root/.cache
ln -s /root/.local/share/pipx/venvs/hatch/bin/uv /usr/local/bin/uv
EOF
ENV PATH="${PATH}:/root/.local/bin"
RUN hatch --version && uv --version
For minimizing size
- ❌ `pyapp` (Standalone installer via `curl`) => Perhaps there is something you can remove from above, but it's not clear what will break (or change over time like the addition of `uv`).
- ✅ Package manager => Probably your best option when it's supported, as it is on Fedora (although there is the disadvantage of version lag, as you cannot get the `1.10.0` release yet to enjoy `uv`). As can be seen above the size is much less and with `uv` it should only go up by roughly 30MB (another issue has the package maintainer discussing it, where they might not make `uv` a weak dependency.. which may force `uv` to be bundled even if you don't need it).
- ✅ `pipx` => This is also reasonably lightweight (approx 150MiB) ~~and also requires installing `uv` separately to `hatch` (_which you can do through the same tool for 30MiB more_)~~ (_EDIT: `hatch` bundles `uv`, you can technically use it directly too_). So slightly more weight, but much more broadly accessible 👍
- ✅ **`FROM scratch`** (210 MiB for the whole image) => The smallest of all, but a bit more involved. You can achieve similar size with a much simpler `alpine` + `pipx` equivalent without the `--installroot` multi-stage trick (183 MiB).. However Alpine being `musl` based has some drawbacks (you'll find some articles specifically about issues with Python, but there can be quite a few gotchas), thus I generally discourage it, especially since glibc based distros like fedora and suse can compete reasonably close size wise (210 MiB) with a few extra lines, but much better performance and compatibility.
Thank you for the fantastic writeup!
As of https://github.com/pypa/hatch/releases/tag/hatch-v1.11.0, the binaries pull down distributions that already have Hatch installed which is about as small as I can make that. This is what the official GitHub action to install Hatch will use when I have time to do so.
There is also a new `self cache` command so after installation you would want to run `hatch self cache dist --remove` and now all that will exist will be the distribution with Hatch that is tied to the binary. The following is an example:
❯ docker run --rm -it ubuntu bash
root@c8f3aacf6229:/# apt update && apt install -y --no-install-recommends curl ca-certificates
root@c8f3aacf6229:/# du -shx
127M .
root@c8f3aacf6229:/# curl -LO https://github.com/pypa/hatch/releases/latest/download/hatch-x86_64-unknown-linux-gnu.tar.gz
root@c8f3aacf6229:/# tar xzf hatch-x86_64-unknown-linux-gnu.tar.gz
root@c8f3aacf6229:/# ./hatch self restore
root@c8f3aacf6229:/# rm hatch-x86_64-unknown-linux-gnu.tar.gz
root@c8f3aacf6229:/# ./hatch self cache dist -r
root@c8f3aacf6229:/# du -shx
470M .
Actually forget what I said please, I'm about to reduce that substantially.
Done!
amazing!! ofek, with pycon travel coming up i won't be able to start a tutorial / how to until after i'm back! but also @polarathene you've provided an INCREDIBLE amount of information above and i suspect / know :) that you know a lot more about this topic than i do. would you like to start a tutorial and i can perhaps contribute? or would you like for me to start / try my best to reflect what you have found and then you can review/ contribute / add that way?
it just seems to me that there is so much information in this thread now, that we should capture it and turn it into a documentation page for others to discover!
ofek that is a considerable reduction in image size!! so so awesome!!
Cheers for the improvement @ofek ! 🥳 (EDIT: It seems there are some gotchas to consider vs a `pipx install hatch`)
The below notes are mostly for my benefit to come back to, but sharing with others if helpful. I'll summarize with a TLDR in a follow-up comment.
Layer insights:
`hatch self restore` size is almost equivalent to `hatch --version` (near 200MB added), just 4 MB less.
`hatch self cache dist --remove` removes 47MB of that added weight from `~/.cache/pyapp`, so you can remove this dir afterwards or leave it with the empty content.
Actual hatch lives as a python script at `/root/.local/share/pyapp/hatch/1303662642487178586/1.11.0/python/bin`, but still relies on the binary extracted from `curl` AFAIK to run (as even with a local python install to run that script directly it is not happy), so move the installer binary to a location like `/usr/local/bin/hatch` 👍
`.pyc` / `__pycache__` content
The final `RUN` layer shows that the `hatch --version` command added about 3MB, and that it's due to running python creating various `.pyc` cache files.
`PYTHONPYCACHEPREFIX=/path/to/cache` is meant to allow customizing the cache dir for this content since Python 3.8, but for some reason in my `Dockerfile` ENV it wasn't having any effect 🤷‍♂️ (it does for a system `pipx install uv`, so presumably this is due to `hatch` using the bundled Python?)
Dockerfile
3 examples, with the first a little bit better documented and avoiding `&&`.
# syntax=docker.io/docker/dockerfile:1
FROM fedora:40
RUN <<EOF
# Fedora comes with curl (and tar + gzip, unlike fedora-minimal), nothing to install via dnf
# Grab the latest release for your arch and extract it to /usr/local/bin, then make it executable:
HATCH_URL="https://github.com/pypa/hatch/releases/latest/download/hatch-$(uname -m)-unknown-linux-gnu.tar.gz"
curl -sSfL "${HATCH_URL}" | tar -xzO > /usr/local/bin/hatch
chmod +x /usr/local/bin/hatch
# Finish installing hatch, then remove the redundant PyApp cache:
hatch self restore
hatch self cache dist --remove
EOF
# syntax=docker.io/docker/dockerfile:1
FROM quay.io/fedora/fedora-minimal:41
RUN <<EOF
dnf5 install -y tar gzip && dnf5 clean all
curl -sfSL "https://github.com/pypa/hatch/releases/latest/download/hatch-$(uname -m)-unknown-linux-gnu.tar.gz" \
| tar -xzO > /usr/local/bin/hatch && chmod +x /usr/local/bin/hatch
hatch self restore && hatch self cache dist --remove
EOF
# syntax=docker.io/docker/dockerfile:1
FROM ubuntu:24.04
RUN <<EOF
apt update && apt install -y --no-install-recommends curl ca-certificates && rm -rf /var/lib/apt/lists/*
curl -sfSL "https://github.com/pypa/hatch/releases/latest/download/hatch-$(uname -m)-unknown-linux-gnu.tar.gz" \
| tar -xzO > /usr/local/bin/hatch && chmod +x /usr/local/bin/hatch
hatch self restore && hatch self cache dist --remove
EOF
Technical details if any of the stuff I did is unfamiliar:
- `syntax=docker.io/docker/dockerfile:1` is a good practice encouraged by docker on their docs.
- The `<<EOF` (HereDoc) syntax is something I prefer, it's not difficult to grok once you understand `<<EOF` is a start marker and the `EOF` the end marker, everything in between is a multi-line input, making it a shell script without `&& \` noise to use a single `RUN` layer. Despite this, I do find many are uncomfortable with it, so it might not be ideal for official documentation? 🤷‍♂️
- The curl URL is also using `$(uname -m)` to get the processor architecture (`x86_64` / `aarch64`), so you can run the same `Dockerfile` on either platform. `pipx` is still probably more straight-forward though.
- I need to use `chmod +x` due to extracting the compressed `hatch` file from `tar.gz` into a new name / location (`tar -O > /path/filename`). This avoids needing another `$(uname -m)` or `mv`, and technically corrects permissions (UID is 1001, GID is 127), but writing the contents to a new file (`>`) lost the original executable bit (`+x`), which needs to be restored.
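The lost executable bit from the last point is easy to reproduce. A minimal sketch with a tiny stand-in archive (the `tool-x86_64` file name is made up, and no Hatch release is downloaded):

```shell
#!/bin/sh
# Sketch: extracting a tar member to stdout and redirecting it drops the mode bits.
set -eu
sandbox="$(mktemp -d)"

# Build a tiny archive holding one executable file (stand-in for the hatch binary).
printf '#!/bin/sh\necho hello\n' > "$sandbox/tool-x86_64"
chmod +x "$sandbox/tool-x86_64"
tar -czf "$sandbox/tool.tar.gz" -C "$sandbox" tool-x86_64

# -O writes the member's contents to stdout; the shell then creates the target
# file itself, with default permissions and the current user as owner.
tar -xzOf "$sandbox/tool.tar.gz" > "$sandbox/tool"
if [ -x "$sandbox/tool" ]; then exec_before="yes"; else exec_before="no"; fi

# Restore the executable bit, as the Dockerfiles above do.
chmod +x "$sandbox/tool"
if [ -x "$sandbox/tool" ]; then exec_after="yes"; else exec_after="no"; fi

echo "executable after extraction: $exec_before"
echo "executable after chmod +x:   $exec_after"
rm -rf "$sandbox"
```

Because the redirect creates the file, its owner is whoever runs the command, which is the "corrects permissions" side effect mentioned above.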
Total size via `du -sx --bytes --si /` (Hatch adds: 140MB `/root/.local/share/pyapp/hatch` + 4MB `/usr/local/bin/hatch`):
- `fedora-minimal:41` (263MB / `262 414 865`) + 18s image build without cache
- `fedora:40` (366MB / `365 360 517`) + 22s image build without cache
- `ubuntu:24.04` (230MB / `229 171 530`) + 45s image build without cache (232MB + 67s for `ubuntu:22.04`, 236MB + 82s for `ubuntu:20.04`)
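Since earlier comments report MiB and these figures are MB, a quick conversion helps when comparing them. A minimal sketch in shell arithmetic; `to_mb` and `to_mib` are hypothetical helper names, and both round up, which matches the MB figures listed above:

```shell
#!/bin/sh
# Sketch: convert a byte count to MB (10^6) and MiB (2^20), rounding up.
# `to_mb` and `to_mib` are hypothetical helpers for illustration.
to_mb()  { echo $(( ($1 + 999999) / 1000000 )); }
to_mib() { echo $(( ($1 + 1048575) / 1048576 )); }

# Byte counts taken from the list above.
for image_bytes in 262414865 365360517 229171530; do
    echo "${image_bytes} bytes = $(to_mb "$image_bytes") MB = $(to_mib "$image_bytes") MiB"
done
```

For example, the 263MB `fedora-minimal:41` image is about 251 MiB, so it should not be compared directly against the `du -shx` numbers from the earlier comments.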
I tend to prefer Fedora as a base as it's faster and better UX with the package manager, but most users may have a better UX with Ubuntu images, especially when they need to add additional system packages (this is sometimes inconvenient with Fedora for proprietary packages like nvidia or certain video codecs IIRC).
Ubuntu has the better image size in this case. It's smaller than `fedora-minimal:41` (which I show for size + build speed comparison, but I encourage regular fedora base until `fedora-minimal` shares the same `dnf` command instead of `microdnf` / `dnf5`, which might happen by the final Fedora 41 release).
GH release URLs naming convention change from `v1.11.0`
The above curl example is for the latest release on GH. If you want to pin a version, note that the release file name dropped the version prefix since `1.11.0`, so the older pattern is not too relevant going forward (at least hopefully it'll remain consistent from now on, omitting the version prefix is convenient for the `latest` approach):
https://github.com/pypa/hatch/releases/latest/download/hatch-x86_64-unknown-linux-gnu.tar.gz
https://github.com/pypa/hatch/releases/download/hatch-v1.11.0/hatch-x86_64-unknown-linux-gnu.tar.gz
https://github.com/pypa/hatch/releases/download/hatch-v1.10.0/hatch-1.10.0-x86_64-unknown-linux-gnu.tar.gz
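For scripts that need to pin a version, the naming change can be hidden behind a small function. This is a sketch only: `hatch_release_url` is a hypothetical helper, it covers just the `-gnu` target, and it assumes every release before `1.11.0` follows the `1.10.0` pattern shown above, which I have not checked for older releases.

```shell
#!/bin/sh
# Sketch: build the release download URL for a given version and architecture.
# `hatch_release_url` is a hypothetical helper; the URL patterns are the three listed above.
hatch_release_url() {
    version="$1"                  # "latest" or e.g. "1.11.0"
    arch="${2:-$(uname -m)}"      # x86_64 / aarch64, defaults to the current machine
    base="https://github.com/pypa/hatch/releases"
    target="${arch}-unknown-linux-gnu"

    if [ "$version" = "latest" ]; then
        echo "${base}/latest/download/hatch-${target}.tar.gz"
        return
    fi

    major="${version%%.*}"
    rest="${version#*.}"
    minor="${rest%%.*}"
    if [ "$major" -gt 1 ] || { [ "$major" -eq 1 ] && [ "$minor" -ge 11 ]; }; then
        # 1.11.0 and later: the file name has no version prefix.
        echo "${base}/download/hatch-v${version}/hatch-${target}.tar.gz"
    else
        # 1.10.0 style (assumed for older releases): version repeated in the file name.
        echo "${base}/download/hatch-v${version}/hatch-${version}-${target}.tar.gz"
    fi
}

hatch_release_url latest x86_64
hatch_release_url 1.11.0 x86_64
hatch_release_url 1.10.0 x86_64
```

Usage in the earlier Dockerfiles would be along the lines of `curl -sSfL "$(hatch_release_url 1.11.0)" | tar -xzO > /usr/local/bin/hatch`.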
GH release variants
You also have, in addition to the glibc target (`-gnu`), a `-musl` one. For anyone interested in the glibc linking:
# These must resolve (and they usually should in a glibc focused distro):
$ ldd /usr/local/bin/hatch
linux-vdso.so.1 (0x00007ffc2d9e0000)
libgcc_s.so.1 => /lib64/libgcc_s.so.1 (0x00007f74e59d3000)
librt.so.1 => /lib64/librt.so.1 (0x00007f74e6003000)
libpthread.so.0 => /lib64/libpthread.so.0 (0x00007f74e5ffe000)
libm.so.6 => /lib64/libm.so.6 (0x00007f74e58f0000)
libdl.so.2 => /lib64/libdl.so.2 (0x00007f74e5ff9000)
libc.so.6 => /lib64/libc.so.6 (0x00007f74e5703000)
/lib64/ld-linux-x86-64.so.2 (0x00007f74e600d000)
# Binary built with Rust 1.78 (latest) and a rather old Ubuntu which suggests `cross-rs` Docker image environment:
$ readelf -p .comment /usr/local/bin/hatch
String dump of section '.comment':
[ 0] GCC: (Ubuntu 5.4.0-6ubuntu1~16.04.12) 5.4.0 20160609
[ 35] rustc version 1.78.0 (9b00956e5 2024-04-29)
# Probably built this way for broader compatibility by targeting a low glibc version,
# cargo zigbuild is a more modern approach that can be used instead:
# Command from my comment here: https://github.com/rust-cross/cargo-zigbuild/issues/231#issuecomment-1987845738
$ readelf -W --version-info --dyn-syms /usr/local/bin/hatch \
| grep 'Name: GLIBC' \
| sed -re 's/.*GLIBC_(.+) Flags.*/\1/g' \
| sort -t . -k1,1n -k2,2n | tail -n 1
2.18
# The equivalent for the static linked musl build (old GCC, March 2021):
$ readelf -p .comment /usr/local/bin/hatch
String dump of section '.comment':
[ 0] GCC: (GNU) 9.4.0
[ 11] rustc version 1.78.0 (9b00956e5 2024-04-29)
[ 3d] GCC: (GNU) 9.2.0
- However the GH releases only publish `-musl` for `x86_64`, thus if you want to support ARM64 (`aarch64`), just use `-gnu`.
- This also means `-musl` via this install method will only work for Alpine with `x86_64` (not that you should be using Alpine for python deployments anyway 🤔 )
Also from `1.11.0` of hatch, there are "dist" variants, which the release page doesn't add clarification to - but extracting these results in 150MiB of content: `hatch` + `uv` + `hatchling` and a bundled Python 3.12. Perhaps related to the improvement @ofek mentioned above?
Feedback
So with the above improvement, `curl` is a great install option with about 140MB weight 🎉 (100MB for bundled Python + 30MB for bundled `uv`)
It'd be neat if you could opt-out of the bundled Python and `uv` options if `hatch` can instead detect and use the ones available from the system after the bootstrapping is done?
`hatch` doesn't seem to be aware of its own bundled distribution though, so I assume that isn't possible?
# Running this command installed
$ hatch python find 3.12
Distribution not installed
# Hatch doesn't consider this as a managed python install, it treats the bundle like a system one?
$ hatch python show
Available
┏━━━━━━━━━━┳━━━━━━━━━┓
┃ Name ┃ Version ┃
┡━━━━━━━━━━╇━━━━━━━━━┩
│ 3.7 │ 3.7.9 │
├──────────┼─────────┤
│ 3.8 │ 3.8.19 │
├──────────┼─────────┤
│ 3.9 │ 3.9.19 │
├──────────┼─────────┤
│ 3.10 │ 3.10.14 │
├──────────┼─────────┤
│ 3.11 │ 3.11.9 │
├──────────┼─────────┤
│ 3.12 │ 3.12.3 │
├──────────┼─────────┤
│ pypy2.7 │ 7.3.15 │
├──────────┼─────────┤
│ pypy3.9 │ 7.3.15 │
├──────────┼─────────┤
│ pypy3.10 │ 7.3.15 │
└──────────┴─────────┘
# Installing it adds another 160MB:
$ hatch python install 3.12
Installed 3.12 @ /root/.local/share/hatch/pythons/3.12
The following directory has been added to your PATH (pending a shell restart):
/root/.local/share/hatch/pythons/3.12/python/bin
$ du -shx /
445M /
Not a major concern, and I may be unfamiliar with a way to configure that, but something to be aware of: if you want to `pip install ...` something, AFAIK that requires bringing in another python install (either via distro system package, `hatch python install <name>`, or implicitly via `pyproject.toml` / `hatch.toml`, etc)... so the above is perhaps not as minimal / convenient as the `pipx` approach?
Gotchas
I assume once installing actual python packages or similar activity, another install of Python is going to add to the weight? `hatch` isn't able to use the one it's bundled? (EDIT: Documented below, it's possible for virtual env to use the same Python bundled)
Docs for `hatch shell` are a bit lacking here:
$ hatch shell --help
Usage: hatch shell [OPTIONS] [ENV_NAME]
Enter a shell within a project's environment.
Options:
--name TEXT
--path TEXT
-h, --help Show this message and exit.
From what I've seen elsewhere, --name refers to a version of Python as listed under the Name column in hatch python show?
- The hatch python find CLI help also was not that clear when referring to an expected arg of NAME btw. Including an example in the help output might be better UX, or just in the associated web docs (where it's also vague). That would make it less guesswork that it's meant to be a value from hatch python show.
- The CLI --help also doesn't show defaults like the web docs do.
- The web docs could perhaps link to this section for the supported versions? While the CLI could mention they're listed in hatch python show?
These two sections from the web docs are a little insightful about what I was after:
- https://hatch.pypa.io/latest/plugins/environment/virtual/#options
- https://hatch.pypa.io/latest/plugins/environment/virtual/#python-resolution
There's an ENV HATCH_PYTHON, which doesn't appear to be documented elsewhere? (I tried the docs search box). It mentions that a value of self can be used, which is not valid for --name or --path with hatch shell, but is valid as an ENV. This prevents installing an extra copy of Python.
hatch shell --name does not appear to be a name related to a Python version however.
Caution: Extra Python expected by default
The first virtual environment adds about 20MB, subsequent ones around 8MB. If no other Python is detected, hatch downloads a new one, which seems to add another 150MB? You can avoid that with the HATCH_PYTHON=self ENV as mentioned above.
$ du -sx --bytes --si /
263M /
# 3-4MB increase:
$ hatch --version
Hatch, version 1.11.0
$ du -sx --bytes --si /
266M /
# Environment added, 18MB increase:
$ cd /tmp && HATCH_PYTHON=self hatch shell
$ du -sx --bytes --si /
284M /
$ exit
# No excess when using without the ENV:
$ hatch shell
$ du -sx --bytes --si /
284M /
# Different location creates a new environment.
# This time since ENV is omitted it's created by bringing in Python 3.12 again:
$ cd /opt && hatch shell
$ du -sx --bytes --si /
448M /
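The before/after measurements above can be wrapped in a small helper. This is only a sketch of how I'd script it: measure_growth is a made-up name, not part of hatch, and it relies on the portable du -k flag (KiB units) rather than the --bytes --si flags used in the transcript.

```shell
#!/bin/sh
# Print how many KiB a directory tree grew while a command ran.
# Usage: measure_growth DIR COMMAND [ARGS...]
# (hypothetical helper, not a hatch feature)
measure_growth() {
    dir=$1
    shift
    # -s: summary only, -x: stay on one filesystem, -k: report in KiB
    before=$(du -sxk "$dir" 2>/dev/null | cut -f1)
    # Silence the command so the only output is the size difference
    "$@" >/dev/null 2>&1
    after=$(du -sxk "$dir" 2>/dev/null | cut -f1)
    echo $((after - before))
}
```

For example, measure_growth / hatch --version would report the growth from the first run, and measure_growth / env HATCH_PYTHON=self hatch env create the growth from creating an environment with the bundled Python.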
Inconsistency within virtual environment due to PATH ENV
The curl install approach differs from a pipx / package install in a notable way.
- Perk: You can share the uv command in the environment without any extra steps (like symlinking).
- Con: You can't use the hatch command within the environment, unless you provide an absolute path to the proper command (/usr/local/bin/hatch, which was already discoverable in PATH).
- These differences apply regardless of HATCH_PYTHON (which only affects the virtual env). The difference is due to an extra PyApp addition to the PATH ENV, thus hatch from that location has priority over your actual hatch binary 🤷♂️
$ echo $PATH
/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
$ cd /tmp && hatch shell --name 3.11
$ python --version
Python 3.12.3
# Fails due to modified PATH:
$ hatch --version
bash: /root/.local/share/pyapp/hatch/1303662642487178586/1.11.0/python/bin/hatch: cannot execute: required file not found
$ /usr/local/bin/hatch --version
Hatch, version 1.11.0
# UV is available however:
$ uv --version
uv 0.1.44
# hatch environment and hatch install location are given precedence for resolving binaries:
$ echo $PATH
/root/.local/share/hatch/env/virtual/opt/y8366zdl/opt/bin:/root/.local/share/pyapp/hatch/1303662642487178586/1.11.0/python/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
dnf install python adds 50 MB and can also be used by the HATCH_PYTHON ENV instead of self. This ENV affects the linked python in the virtual env PATH, which adds a symlink to that location. It seems unnecessary though, since when Python is already installed on the system like this, hatch detects that and will use it by default.
In contrast, with a pipx / package install, where Python is externally available to hatch, the environment is likewise created using that Python by default. You'll find that the PATH ENV isn't altered in the same way: hatch --version will work in the environment while uv will not:
$ dnf install -y pipx && pipx install hatch
$ cd /tmp && hatch shell
$ hatch --version
Hatch, version 1.11.0
$ uv --version
bash: uv: command not found
$ env | grep PATH
PATH=/root/.local/share/hatch/env/virtual/tmp/6WcazSRI/tmp/bin:/root/.local/bin:/root/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
I assume this difference isn't intentional?
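The shadowing itself can be reproduced without hatch installed. A minimal sketch, assuming two throwaway directories: "real" stands in for /usr/local/bin and "bundled" for the PyApp bin directory. Both scripts are stand-ins I made up for illustration, they are not the actual hatch files.

```shell
#!/bin/sh
# Stand-in directories: "real" plays /usr/local/bin, "bundled" plays the PyApp bin dir.
demo=$(mktemp -d)
mkdir -p "$demo/real" "$demo/bundled"

# Two different executables sharing the same command name
printf '#!/bin/sh\necho "real hatch"\n' > "$demo/real/hatch"
printf '#!/bin/sh\necho "bundled hatch script"\n' > "$demo/bundled/hatch"
chmod +x "$demo/real/hatch" "$demo/bundled/hatch"

# Before entering the environment: only the real location is on PATH
PATH="$demo/real:$PATH"
first=$(hatch)

# Entering the environment prepends the bundled location,
# so the same command name now resolves to a different file
PATH="$demo/bundled:$PATH"
hash -r
second=$(hatch)

# An absolute path bypasses the PATH lookup entirely
absolute=$("$demo/real/hatch")

echo "before: $first"
echo "after prepend: $second"
echo "absolute path: $absolute"
```

Whichever directory comes first in PATH wins, which is why calling /usr/local/bin/hatch by absolute path still works inside the environment.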
Summary of prior message
Still a tad long, see the prior message for more details.
GH Releases:
- Hatch releases from v1.11.0 have changed their GH release naming convention.
  - The version is no longer part of the filename, only the release tag in the URL.
  - The compressed contents are also normalized to hatch instead of the filename without .tar.gz.
  - These changes are great, but something the release didn't draw attention to (or explain the dist additions that appear to be 150MB of content similar to the standalone?).
- Only -gnu (glibc) is available for ARM64, -musl only offers AMD64 (x86_64), it also seems to be broken (fails to install properly, but so does v1.10.0).
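With the version dropped from the filename, the download URL only depends on the arch and libc. A rough sketch of building it: hatch_release_url is a hypothetical helper, the URL pattern is the one used by the curl install elsewhere in this thread, and the musl filename is my assumption that it follows the same pattern as the gnu one.

```shell
#!/bin/sh
# Build the download URL for the standalone hatch binary.
# Args: ARCH (as printed by `uname -m`), LIBC ("gnu" by default, or "musl").
# (hypothetical helper; the musl filename is assumed to mirror the gnu one)
hatch_release_url() {
    arch=$1
    libc=${2:-gnu}
    # Per the notes above, musl builds were only offered for x86_64
    if [ "$libc" = "musl" ] && [ "$arch" != "x86_64" ]; then
        echo "no musl build listed for $arch" >&2
        return 1
    fi
    echo "https://github.com/pypa/hatch/releases/latest/download/hatch-${arch}-unknown-linux-${libc}.tar.gz"
}
```

Usage would be something like HATCH_URL="$(hatch_release_url "$(uname -m)")" before the curl step.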
Standalone installer depends on external Python despite bundling its own:
- Other install methods require Python already (pipx, distro package), while the PyApp could use the one that hatch seems to bundle regardless of install method?
- When creating an environment and no other Python install exists, rather than using its own copy, it'll download another Python (aka hatch python install). Unless you use HATCH_PYTHON=self as a workaround, expect an extra 150MB once you use hatch shell / hatch env or similar.
Standalone installer prepends its bin location to PATH ENV:
- While it's nice to have uv available in the environment without extra config, this overrides the hatch binary with the hatch python script from its PyApp install.
- As a result you're prevented from using the hatch command within a Hatch managed environment, unlike other install methods where this is a non-issue because only the venv is prepended to PATH.
Docs (Web / CLI help) need love:
- hatch shell especially has --help output that does not explain the --name and --path options. There's overlap with these elsewhere but it's not treated the same. Meanwhile the web docs for this command are equally vague.
- hatch python find + hatch python install are examples where the CLI output is not very good at communicating what is expected. NAME must be understood to be a Python distribution (as per the web docs venv section on the topic).
  - The user must discover this via the web docs or hatch python show, which outputs a Name column that can be inferred as what sibling commands want.
  - CLI help could provide a one-liner example to better communicate what the value of NAME implies (eg: 3.12), while the web docs could link to an appropriate section that already covers what is supported (venv plugin page).
- While ENV like HATCH_PYTHON are briefly mentioned (venv plugin page) so that you can learn about HATCH_PYTHON=self, this ENV and others like it don't appear to be documented on the web docs?
- hatch config show has:
  - Some settings that don't seem to be documented clearly? (eg: python = "isolated")
  - While some defaults are also a little peculiar? Like the template defaults:
    - U.N. Owen appears to be a pun for "unknown" and used as the alias of a murderer from a novel?
    - The email [email protected] doesn't appear to be a reference, it could just be [email protected]?
@lwasser I've got a bit to juggle elsewhere, but I'd be happy to review a PR when I can spare the time.
I am not that experienced with Python, but I know Linux and Docker very well! If you've got any questions feel free to reach out 👍
I think most of the info I've covered above doesn't really need to go into the docs. It was more about exploring what options were available and the tradeoffs 😎
- I've revised a Dockerfile for you below, it's documented well and should convey what's necessary to get a basic image with hatch set up.
- I've added a separate context section below since others interested in docs for Docker may run into similar concerns. One that's common with Docker images is handling deps in a separate layer, although I'd like to try manually installing some in advance.
Dockerfile example
Decisions made:
- Ubuntu is the smallest image from the above experiments. It is also a base image choice that most users will be comfortable with as a reference.
- pipx is only 5MB larger than the curl install approach.
  - Pro: pipx provides a simpler UX? (especially for supporting both AMD64 + ARM64 builds)
  - Con: The pipx image does take 90s on my system to build, vs 45s for the curl approach (or 17s via fedora-minimal, while its pipx equivalent takes 36s). This shouldn't be too much of a concern provided the layer isn't invalidated in future builds (cache mounts can alleviate that if needed).
- I find the Dockerfile below with the HereDoc feature easier to grok, I'd encourage choosing that.
  - Alternatively, I've provided the old technique of running commands within a single RUN layer.
  - I can't recall the compatibility of this feature with Docker releases prior to v23 (Feb 2023). I think the ENV DOCKER_BUILDKIT=1 may have been required.
- Symlinking for uv seemed most convenient to manage, while avoiding an extra 30MB.
  - Personally I'd opt out of the bundled uv in hatch if I could, and pipx install uv with the HATCH_UV ENV set.
  - Maybe it's ok to delete the bundled uv, but that just swaps ln for rm, thus no improvement to the Dockerfile?
# syntax=docker.io/docker/dockerfile:1
FROM ubuntu:24.04
RUN <<HEREDOC
# Install pipx, then empty the apt cache:
apt update && apt install -y --no-install-recommends pipx
rm -rf /var/lib/apt/lists/*
# Updates the USER `.bashrc` and `.profile` to append `${HOME}/.local/bin` to $PATH
pipx ensurepath
# Install hatch, then empty the pip cache:
pipx install hatch && rm -rf "${HOME}/.cache/pip"
# Hatch bundles UV, symlink to it to avoid needing `pipx install uv`:
ln -s "${HOME}/.local/share/pipx/venvs/hatch/bin/uv" /usr/local/bin/uv
HEREDOC
Old approach for `RUN`
FROM ubuntu:24.04
RUN apt-get update \
&& apt-get install -y --no-install-recommends pipx \
&& rm -rf /var/lib/apt/lists/* \
&& pipx ensurepath \
&& pipx install hatch \
&& rm -rf "${HOME}/.cache/pip" \
&& ln -s "${HOME}/.local/share/pipx/venvs/hatch/bin/uv" /usr/local/bin/uv
Fedora equivalent (very little difference)
# syntax=docker.io/docker/dockerfile:1
FROM fedora:40
RUN <<HEREDOC
dnf install -y pipx && dnf clean all
pipx ensurepath
pipx install hatch && rm -rf "${HOME}/.cache/pip"
ln -s "${HOME}/.local/share/pipx/venvs/hatch/bin/uv" /usr/local/bin/uv
HEREDOC
Reference: Alternative - Standalone via curl
NOTE: Current caveats apply:
- The hatch command does not work in a venv due to the modified PATH.
- uv is not symlinked, since that same modified PATH is what makes it available.
# syntax=docker.io/docker/dockerfile:1
FROM ubuntu:24.04
RUN <<EOF
apt update && apt install -y --no-install-recommends curl ca-certificates
rm -rf /var/lib/apt/lists/*
# Grabs the latest release for your arch and extracts it to /usr/local/bin:
HATCH_URL="https://github.com/pypa/hatch/releases/latest/download/hatch-$(uname -m)-unknown-linux-gnu.tar.gz"
curl -sfSL "${HATCH_URL}" | tar -xzO > /usr/local/bin/hatch
# Permit this file to run / execute:
chmod +x /usr/local/bin/hatch
# Installs standalone hatch, then does some cleanup (remove PyApp cache):
hatch self restore && hatch self cache dist --remove
EOF
Fedora equivalent (without the commentary):
- Larger base image size than Ubuntu (over 100MB), but faster to build. If you build multiple images for projects that share the same base image layer it's less of an issue.
- This image already has curl, so there are no packages to install. Unlike fedora-minimal, it already has tar + gzip too.
- TIP: Since Hatch v1.11.0, the tar.gz files have normalized the compressed filename to hatch. You could alternatively use tar -xz && mv hatch /usr/local/bin/hatch instead, no chmod +x needed, but the original UID and GID may not be compatible for non-root customizations (the GID changed with v1.11.0, UID remains at 1001).
# syntax=docker.io/docker/dockerfile:1
FROM fedora:40
RUN <<EOF
HATCH_URL="https://github.com/pypa/hatch/releases/latest/download/hatch-$(uname -m)-unknown-linux-gnu.tar.gz"
curl -sfSL "${HATCH_URL}" | tar -xzO > /usr/local/bin/hatch
chmod +x /usr/local/bin/hatch
hatch self restore && hatch self cache dist --remove
EOF
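On the UID / GID point from the TIP above, one option is to tell tar not to restore the archived ownership when extracting. A minimal sketch, assuming GNU tar (the --no-same-owner flag); extract_binary is a hypothetical helper name, not something from hatch.

```shell
#!/bin/sh
# Extract a single-binary tarball into DEST_DIR without restoring the archived UID/GID.
# Args: TARBALL DEST_DIR
# (hypothetical helper; assumes GNU tar)
extract_binary() {
    tarball=$1
    dest=$2
    mkdir -p "$dest"
    # --no-same-owner: extracted files belong to the extracting user,
    # not to the UID/GID recorded inside the archive.
    # The archived mode bits (including the executable bit) are still applied.
    tar -xzf "$tarball" --no-same-owner -C "$dest"
}
```

In the Dockerfile that could look like curl -sfSL "${HATCH_URL}" -o /tmp/hatch.tar.gz followed by extract_binary /tmp/hatch.tar.gz /usr/local/bin, which also avoids the separate chmod +x step.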
Context
As the type of user that'd be interested in such docs when I was looking into Hatch, but also as a user new to Python that wants to run some GitHub projects in Docker containers, I wanted to know which install process for hatch was going to work best to minimize disk space vs plain pip install.
- We've pretty much established that a pipx install is still the best choice right now (the standalone installer has some caveats remaining, while distro packages are behind in releases to enjoy uv support).
- The availability of the standalone installer (and its apparent small size on GH releases) did make me wonder if I could use that without pipx or Python, so I might have tried it anyway to compare (and then got confused once actually using hatch due to the present issues outlined above). The docs could try to emphasize that pipx has the least amount of friction / surprises? 🤷♂️
- I'll be trying Hatch at a later date with UV to run some PyTorch based projects, if I learn anything else from that worth sharing I'll chime in here 👍
An unresolved concern I have is how to handle PyTorch. Deps in hatch.toml / pyproject.toml don't have a clear command to install/sync but instead require hatch shell / hatch env run to trigger that implicitly?
- If I want to "warm" up the cache for UV in advance by installing the 4-5GB torch uses, this should be done in a separate RUN layer (or image/stage) before other deps, to prevent this data being discarded when something else in the project is updated (hatch.toml, project source files), which could invalidate the layer.
- I'm not sure how hatch (and the virtual environments it manages through UV) is involved in that, it's not something you'd really worry about outside of a container.
- While Docker does have cache mounts which could help with builds (and allow a hatch.toml to be present without layer invalidation concerns), this would prevent using hard links, thus incurring a copy across the mount boundary introduced. Not really a problem when the image is only being built for a single virtual environment using PyTorch, but if I want to have several that may be a concern.
- This topic is perhaps more niche / advanced, so it doesn't need to be tackled with the initial Docker guidance, but if someone knows how to approach it that'd be good! Without the cache mount usage, I suppose I could have a separate dummy hatch.toml environment to bring these in (or directly run uv venv + uv pip install, without hatch involved?). The hardlinking feature should take care of the rest I think (if I manage a hatch.toml for each project, I think they can inherit the same PyTorch environment?). I'll try it when I can :)
# Related UV issue as below will need to handle different "local identifiers":
# https://github.com/astral-sh/uv/issues/3437#issuecomment-2102125794
[envs.default]
type = "virtual"
path = "venv-pytorch"
dependencies = [
"torch==2.3+cu121",
"torchvision",
"torchaudio"
]
installer = "uv"
[envs.default.env-vars]
UV_INDEX_URL = "https://download.pytorch.org/whl/cu121"
I just wanted to check back in here, y'all. I've been swamped with other volunteer commitments, and I won't be able to follow through with the docker PR. I hope that someone else can hop in and work on this, as this issue contains a lot of great information. We are having good success with using and teaching Hatch over at pyOpenSci, so I hope to continue to see the use of and documentation for Hatch grow!