Race condition when two libraries update the same __init__.py
This has had me pulling my hair out.
If pydantic is included in requirements.txt, pyjwt fails with:
#12 7.705 Traceback (most recent call last):
#12 7.705 File "<string>", line 1, in <module>
#12 7.706 ImportError: cannot import name 'PyJWKClient' from 'jwt' (/opt/venv/lib/python3.12/site-packages/jwt/__init__.py)
It looks like the __init__.py doesn't get updated with PyJWKClient in the failed runs, but there's no indication as to why.
I can't make it fail deterministically, but after a docker system prune it fails roughly every other (?) time.
Here is the Dockerfile:
FROM python:3.12.3 AS build
ENV PATH="/opt/venv/bin:$PATH"
ENV VIRTUAL_ENV="/opt/venv"
# Python runtime dependencies
RUN apt-get update && apt-get install -y --no-install-recommends \
        make \
        g++ \
        gfortran \
        cmake && \
    python -m venv /opt/venv
COPY --from=ghcr.io/astral-sh/uv:latest /uv /bin/uv
WORKDIR /app
COPY requirements.txt ./
RUN echo 1 && \
    uv pip -v --no-cache sync requirements.txt && \
    python -c "from jwt import PyJWKClient"
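For anyone hitting something similar: a quick diagnostic (not part of the original report, just a sketch using the standard library's importlib.metadata) is to list every file that more than one installed distribution claims to own, since any such file is a clobber candidate:

```python
from collections import defaultdict
from importlib.metadata import distributions

# Map each installed file path to the distributions that ship it.
owners = defaultdict(list)
for dist in distributions():
    for f in dist.files or []:
        owners[str(f)].append(dist.metadata["Name"])

# Any path claimed by more than one distribution is a clobber candidate;
# in this issue, jwt/__init__.py would show up owned by both jwt and pyjwt.
for p, names in sorted(owners.items()):
    if len(names) > 1:
        print(p, "->", names)
```

In an environment with both jwt 1.3.1 and pyjwt installed, `jwt/__init__.py` would be listed with two owners.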
Here is the requirements.txt:
# This file was autogenerated by uv via the following command:
# uv pip compile requirements.in pyproject.toml
annotated-types==0.7.0
# via pydantic
anyio==3.7.1
# via
# fastapi
# starlette
certifi==2022.12.7
# via requests
cffi==1.17.0
# via cryptography
charset-normalizer==2.1.1
# via requests
cryptography==43.0.0
# via jwt
# fastapi==0.105.0
# via -r requirements.in
google-search-results==2.4.2
# via locus-ferb (pyproject.toml)
idna==3.4
# via
# anyio
# requests
jwt==1.3.1
# via locus-ferb (pyproject.toml)
pycparser==2.22
# via cffi
pydantic==2.8.2
# via fastapi
pydantic-core==2.20.1
# via pydantic
pyjwt==2.9.0
# via locus-ferb (pyproject.toml)
requests==2.28.1
# via
# google-search-results
# serpapi
serpapi==0.1.5
# via locus-ferb (pyproject.toml)
sniffio==1.3.1
# via anyio
# starlette==0.27.0
# via fastapi
typing-extensions==4.9.0
# via
# fastapi
# pydantic
# pydantic-core
urllib3==1.26.13
# via requests
Attached are the two logs, running the same command one after the other:
docker system prune && docker build --ssh default --progress plain -f Dockerfile-fail .
build-working.log
build-failing.log
Well, in both examples there are lines like:
#12 7.601 DEBUG File already exists (subsequent attempt), overwriting: /opt/venv/lib/python3.12/site-packages/jwt/__init__.py
It looks like there's a conflict between two packages and you're seeing a race condition where it succeeds or fails depending on which one overwrites the file last.
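As a toy illustration of that last-writer-wins race (not uv's actual code; file contents here are made up), two concurrent writers to the same path end up with whichever copy was written last:

```python
import os
import random
import tempfile
import threading
import time

path = os.path.join(tempfile.mkdtemp(), "__init__.py")

def install(content: str) -> None:
    # Simulate one installer unpacking its copy of jwt/__init__.py,
    # with a little jitter as in a real parallel unpack.
    time.sleep(random.random() / 100)
    with open(path, "w") as f:
        f.write(content)

t1 = threading.Thread(target=install, args=("# jwt 1.3.1: no PyJWKClient\n",))
t2 = threading.Thread(target=install, args=("# pyjwt 2.9.0: defines PyJWKClient\n",))
t1.start(); t2.start()
t1.join(); t2.join()

# Last writer wins: the import only works if pyjwt's copy survived.
with open(path) as f:
    print(f.read())
```

Run it a few times and either copy can win, which matches the build failing only some of the time.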
That makes sense: pyjwt adds functionality to jwt, so its update to the __init__.py should come after. Is there a way to force an install-order dependency? If there were, would that fix the race condition?
I'll check to see if we can remove jwt and keep only pyjwt. But I see the same behavior with another pair of libraries, and in that case I know I need both.
Or maybe I have to do a sync, and then separately install the second library with uv pip install?
The race condition doesn't happen with pip, presumably because it installs packages sequentially (and slowly).
I see. I don't know if it's feasible for us to unpack packages in a defined order like that, @charliermarsh would know better.
It seems like pretty bad behavior for a package to rely on patching an existing one by overwriting its files. I think in this case, you'd be best off running two separate install operations — one to prepare the environment, and the next to patch the package.
Could the warning be surfaced, so it's clearer that one is clobbering the other?
Agreed on the package behavior, but such is life with dependence on third party libs...
And I take back the claim that it doesn't happen with pip; I see reports of the same import failing there too: https://github.com/jpadilla/pyjwt/issues/685
https://stackoverflow.com/questions/68912059/module-not-found-error-when-using-google-search-api-serpapi
Also, thank you @zanieb for identifying the issue so quickly.
After looking at my options for working around this while sticking with a "sync"-type workflow, I was hoping to use uv lock and then uv sync --no-install-package pyjwt instead of uv pip sync, but...
The project is marked as unmanaged
Do you have unmanaged = true set anywhere?
I do (I set managed = false in the pyproject.toml), because our existing Docker images use /opt/venv and I was trying to avoid changing that. But I think I will move things into /app/.venv so I can use uv sync instead of uv pip.
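For reference, the setting in question lives under the `[tool.uv]` table in pyproject.toml (minimal fragment; with it set, `uv lock`/`uv sync` report the project as unmanaged):

```toml
[tool.uv]
# Tell uv not to treat this directory as a managed project;
# project commands like `uv sync` refuse to run while this is false.
managed = false
```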
OK, I have it working with this Dockerfile and uv.lock:
FROM python:3.12.3 AS build
ENV PATH="/app/.venv/bin:$PATH"
ENV VIRTUAL_ENV="/app/.venv"
# Python runtime dependencies
RUN apt-get update && apt-get install -y --no-install-recommends \
        make \
        g++ \
        gfortran \
        cmake
COPY --from=ghcr.io/astral-sh/uv:latest /uv /bin/uv
WORKDIR /app
COPY uv.lock pyproject.toml ./
RUN RUST_LOG=trace uv -v --no-cache sync --no-install-package pyjwt --no-install-package google-search-results && \
    RUST_LOG=trace uv -v --no-cache sync && \
    python -c "from jwt import PyJWKClient"
I would like to request a feature for future uv noobs: a warning when one library clobbers another. I spent a fair bit of time going through permutations, thinking something was wrong with my environment/cache/Docker setup.
@konstin is it feasible for us to use warn_user_once when installing a package clobbers files?
I'm honestly not sure if we should, because I think there are packages that do this somewhat intentionally.
In the cases I looked at where overwriting __init__.py happened intentionally, the __init__.py was identical in both packages. We should show a warning when the contents are not the same, in which case we know there is a race condition. (We have to read the existing file for that, so we'll need to take a lock after io::ErrorKind::AlreadyExists to make sure the read itself isn't raced.)
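The real implementation would be in Rust inside uv; purely as an illustration of the content-comparison idea (all names here are made up, and this sketch omits the locking), it could look like:

```python
import os
import warnings

def write_with_clobber_check(path: str, new_bytes: bytes) -> None:
    """Write new_bytes to path, warning if different existing contents are overwritten.

    Hypothetical sketch of the proposed check; real installer code would need a
    lock around the read to keep the comparison itself race-free.
    """
    if os.path.exists(path):
        with open(path, "rb") as f:
            old = f.read()
        if old != new_bytes:
            # Identical contents (the intentional-overwrite case) stay silent;
            # differing contents mean a genuine clobber and a potential race.
            warnings.warn(f"{path} already exists with different contents; overwriting")
    with open(path, "wb") as f:
        f.write(new_bytes)
```

Installing jwt and then pyjwt (in either order) would trigger the warning exactly once, since only the second write sees differing contents.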
I think this one does do it intentionally, but if it's installed in the same invocation as its target package, that's a big problem, since we don't unpack in a deterministic order.
Could there be an option, e.g. uv add <package> --depends_on <main_package>, that would update uv.lock to install that package after the main package?
There could be but it's a pretty niche use-case and controlling the install order manually seems weird. I feel like this is a problem with the design of these packages, not a common use-case we should add configuration options for.
Hi there. I am revisiting my Dockerfile build as we have added some more libraries, and I have found a few more instances of this race condition. I was wondering if there have been any improvements or configuration settings in newer versions that might mitigate this issue...