bug(build): `pixi build` accumulates bytecode files across different python versions when using a python build variant

Open cpcloud opened this issue 3 months ago • 4 comments

Checks

  • [x] I have checked that this issue has not already been reported.

  • [x] I have confirmed this bug exists on the latest version of pixi, using pixi --version.

Reproducible example

You need my pixi-build branch of numba-cuda

Commands I ran and their output:

pixi build

pixi.toml/pyproject.toml file that reproduces my issue: It's included in the branch I link to above

pixi info output:

❯ pixi info
System
------------
       Pixi version: 0.59.0
           Platform: linux-64
   Virtual packages: __unix=0=0
                   : __linux=6.12.50=0
                   : __glibc=2.40=0
                   : __cuda=13.0=0
                   : __archspec=1=zen2
   Config locations: No config files found

Global
------------
I'd rather not include this

Workspace
------------
               Name: numba-cuda
       Last updated: 03-11-2025 10:58:01

Environments
------------
        Environment: default
           Features: default
           Channels: conda-forge
   Dependency count: 1
       Dependencies: numba-cuda
   Target platforms: linux-64, linux-aarch64, win-64
              Tasks: test, simtest, build-tests, bench, benchcmp, clean-tests

        Environment: cu-12-0
           Features: cu-12-0, test, cu, cu-12, default
        Solve group: cu-12-0
           Channels: conda-forge
   Dependency count: 16
       Dependencies: cuda-version, make, pre-commit, psutil, cffi, pytest, pytest-xdist, pytest-benchmark, cuda-nvcc, cuda-nvcc-impl, cuda-cuobjdump, cuda-nvrtc, libnvjitlink, cuda-cccl, libcurand, numba-cuda
  PyPI Dependencies: ml_dtypes, filecheck
   Target platforms: linux-aarch64, win-64, linux-64
System requirements: cuda = "12"
              Tasks: test, simtest, build-tests, bench, benchcmp, clean-tests

        Environment: cu-12-2
           Features: cu-12-2, test, cu, cu-12, nvvm, default
        Solve group: cu-12-2
           Channels: conda-forge
   Dependency count: 17
       Dependencies: cuda-version, make, pre-commit, psutil, cffi, pytest, pytest-xdist, pytest-benchmark, cuda-nvcc, cuda-nvcc-impl, cuda-cuobjdump, cuda-nvrtc, libnvjitlink, cuda-cccl, libcurand, cuda-nvvm, numba-cuda
  PyPI Dependencies: ml_dtypes, filecheck
   Target platforms: linux-64, win-64, linux-aarch64
System requirements: cuda = "12"
              Tasks: test, simtest, build-tests, bench, benchcmp, clean-tests

        Environment: cu-12-8
           Features: cu-12-8, test, cu, cu-12, cu-rt, nvvm, default
        Solve group: cu-12-8
           Channels: conda-forge
   Dependency count: 18
       Dependencies: cuda-version, make, pre-commit, psutil, cffi, pytest, pytest-xdist, pytest-benchmark, cuda-nvcc, cuda-nvcc-impl, cuda-cuobjdump, cuda-nvrtc, libnvjitlink, cuda-cccl, libcurand, cuda-runtime, cuda-nvvm, numba-cuda
  PyPI Dependencies: ml_dtypes, filecheck
   Target platforms: linux-64, win-64, linux-aarch64
System requirements: cuda = "12"
              Tasks: test, simtest, build-tests, bench, benchcmp, clean-tests

        Environment: cu-12-9
           Features: cu-12-9, test, bench, cu, cu-12, cu-rt, nvvm, default
        Solve group: cu-12-9
           Channels: conda-forge
   Dependency count: 22
       Dependencies: cuda-version, make, pre-commit, psutil, cffi, pytest, pytest-xdist, pytest-benchmark, pytorch, pytorch-gpu, libtorch, cupy, cuda-nvcc, cuda-nvcc-impl, cuda-cuobjdump, cuda-nvrtc, libnvjitlink, cuda-cccl, libcurand, cuda-runtime, cuda-nvvm, numba-cuda
  PyPI Dependencies: ml_dtypes, filecheck
   Target platforms: linux-aarch64, linux-64, win-64
System requirements: cuda = "12"
              Tasks: test, simtest, build-tests, bench, benchcmp, clean-tests

        Environment: cu-13-0
           Features: cu-13-0, test, cu, cu-13, cu-rt, nvvm, default
        Solve group: cu-13-0
           Channels: conda-forge
   Dependency count: 18
       Dependencies: cuda-version, make, pre-commit, psutil, cffi, pytest, pytest-xdist, pytest-benchmark, cuda-nvcc, cuda-nvcc-impl, cuda-cuobjdump, cuda-nvrtc, libnvjitlink, cuda-cccl, libcurand, cuda-runtime, cuda-nvvm, numba-cuda
  PyPI Dependencies: ml_dtypes, filecheck
   Target platforms: linux-64, linux-aarch64, win-64
System requirements: cuda = "13"
              Tasks: test, simtest, build-tests, bench, benchcmp, clean-tests

Issue description

When running pixi build, each new package is much bigger than the last. Here's some ls output:

.rw-------  2.4M cloud  3 Nov 11:10   numba-cuda-0.20.0-py310h8c4f31c_0.conda
.rw-------  7.5M cloud  3 Nov 11:11   numba-cuda-0.20.0-py311h43a39b2_0.conda
.rw-------   12M cloud  3 Nov 11:11   numba-cuda-0.20.0-py312h2078e5b_0.conda
.rw-------   17M cloud  3 Nov 11:13   numba-cuda-0.20.0-py313h7813266_0.conda

This seems to be because each package includes the previously-built package's .pyc files:

❯ tar tf pkg-numba-cuda-0.20.0-py310h8c4f31c_0.tar | rg '\.pyc$' -c
435
❯ tar tf pkg-numba-cuda-0.20.0-py311h43a39b2_0.tar | rg '\.pyc$' -c
1305
❯ tar tf pkg-numba-cuda-0.20.0-py312h2078e5b_0.tar | rg '\.pyc$' -c
2175
❯ tar tf pkg-numba-cuda-0.20.0-py313h7813266_0.tar | rg '\.pyc$' -c
3045
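
For anyone without tar and rg at hand, the same count can be sketched in Python with the standard-library tarfile module. This is only an illustration of the check above; the helper name `count_pyc` is made up here and is not part of pixi.

```python
import tarfile


def count_pyc(tar_path):
    """Return how many members of the tar archive have a name ending in .pyc."""
    with tarfile.open(tar_path) as tar:
        return sum(1 for name in tar.getnames() if name.endswith(".pyc"))


# Hypothetical usage against one of the archives listed above:
# count_pyc("pkg-numba-cuda-0.20.0-py310h8c4f31c_0.tar")
```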

There's also a weird phenomenon: not only are the old pyc files included in their own site-packages directory, but the new pyc files are also placed in the site-packages directory of the previous build, leading to invalid paths like

lib/python3.10/site-packages/numba_cuda/numba/cuda/cext/__pycache__/__init__.cpython-311.pyc

showing up in the package. Note the -311.pyc file in a python3.10/site-packages directory.
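
A rough way to flag such paths is to compare the pythonX.Y directory with the cpython-XY tag in the file name. The sketch below assumes the lib/pythonX.Y/site-packages layout shown above and the standard `__pycache__/<name>.cpython-XY.pyc` naming; `is_mismatched` is a hypothetical helper, not something pixi provides.

```python
import re

# Matches e.g. lib/python3.10/site-packages/.../__pycache__/mod.cpython-311.pyc
_PYC = re.compile(
    r"lib/python(?P<major>\d+)\.(?P<minor>\d+)/site-packages/"
    r".*__pycache__/[^/]+\.cpython-(?P<tag>\d+)(?:\.opt-\d)?\.pyc$"
)


def is_mismatched(path):
    """True if the .pyc tag disagrees with the pythonX.Y directory it sits in."""
    m = _PYC.search(path)
    if m is None:
        return False  # not a cached bytecode file in the assumed layout
    return m["major"] + m["minor"] != m["tag"]
```

Run over the output of `tar tf`, this would pick out the -311.pyc files that ended up under python3.10/site-packages.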

This is why each successive count is 435 * 2 = 870 files more than the last: 435 for the old paths that were valid in the previous build, and 435 for the old paths that are invalid in the current build.
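
As a quick sanity check of that arithmetic, the reported counts fit a base of 435 plus 870 per earlier build. The snippet only verifies the numbers quoted above; it does not explain the mechanism, and `expected_pyc_count` is a name made up for illustration.

```python
BASE = 435  # .pyc files in the first (py310) package, as reported above


def expected_pyc_count(build_index, base=BASE):
    """Count predicted for the build at build_index (0 = first build)."""
    return base + 2 * base * build_index


reported = [435, 1305, 2175, 3045]  # py310, py311, py312, py313
predicted = [expected_pyc_count(i) for i in range(len(reported))]
matches = predicted == reported
```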

Expected behavior

I would expect pyc files that are not relevant to the current Python build variant to be excluded from the build.

cpcloud avatar Nov 03 '25 16:11 cpcloud

If I build again while all these pyc files are still lying around, the build continues to grow even more:

After first pixi build invocation

❯ ls *.conda
Permissions Size User  Date Modified Name
.rw-------  2.4M cloud  3 Nov 11:10   numba-cuda-0.20.0-py310h8c4f31c_0.conda
.rw-------  7.5M cloud  3 Nov 11:11   numba-cuda-0.20.0-py311h43a39b2_0.conda
.rw-------   12M cloud  3 Nov 11:11   numba-cuda-0.20.0-py312h2078e5b_0.conda
.rw-------   17M cloud  3 Nov 11:13   numba-cuda-0.20.0-py313h7813266_0.conda

After second pixi build invocation

❯ ls *.conda
Permissions Size User  Date Modified Name
.rw-------   15M cloud  3 Nov 11:33   numba-cuda-0.20.0-py310h8c4f31c_0.conda
.rw-------   17M cloud  3 Nov 11:34   numba-cuda-0.20.0-py311h43a39b2_0.conda
.rw-------   16M cloud  3 Nov 11:35   numba-cuda-0.20.0-py312h2078e5b_0.conda
.rw-------   17M cloud  3 Nov 11:36   numba-cuda-0.20.0-py313h7813266_0.conda
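
A possible stopgap, assuming the stale files live somewhere under a directory you control (the thread does not confirm where they accumulate), is to delete leftover .pyc files between builds. The helper name and the `root` argument below are hypothetical, and this does not fix anything in pixi itself.

```python
from pathlib import Path


def remove_stale_bytecode(root):
    """Delete every *.pyc file under root and return how many were removed."""
    # Collect first so nothing is deleted while the tree is still being walked.
    stale = list(Path(root).rglob("*.pyc"))
    for pyc in stale:
        pyc.unlink()
    return len(stale)
```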

cpcloud avatar Nov 03 '25 16:11 cpcloud

Thanks for reporting this! My first guess is that we need to exclude *pyc files from our input globs - I will take a look at it

nichmor avatar Nov 04 '25 08:11 nichmor

> Thanks for reporting this! My first guess is that we need to exclude *pyc files from our input globs - I will take a look at it

The *.pyc pattern is already present in the exclude section of pyproject.toml. I think the correct solution would be to find out why it is not respected, rather than adding a special case.

sieciobywatel-ng avatar Nov 05 '25 15:11 sieciobywatel-ng

FWIW, I am not using pyproject.toml, so for my use case this would have to work without one.

cpcloud avatar Nov 05 '25 15:11 cpcloud