Avoid excess copies of full project directory

Open jbms opened this issue 3 years ago • 7 comments

I see that the entire source tree is copied a few different times, and this can add significant overhead to the build time when it contains e.g. a large node_modules directory, or other directories with a large number of files.

  1. The manylinux builder seems to copy the project directory rather than just mounting it.

  2. The wheels are built with pip wheel, which copies the entire project directory. Ideally pip wheel wouldn't do this (https://github.com/pypa/pip/issues/7555), but I'm not sure that is going to be resolved anytime soon. As a workaround, you can avoid that by running python setup.py bdist_wheel instead.

Of course, setuptools/distutils by default behaves pretty badly with in-tree builds and litters the source tree with intermediate files that can cause problems on subsequent builds. However, for my own projects I have implemented workarounds to avoid that and it would be nice to have the option in cibuildwheel to avoid the additional copies.

jbms avatar Nov 25 '20 20:11 jbms

  1. The manylinux builder seems to copy the project directory rather than just mounting it.

On some CIs (I know Circle, not sure about the others), mounting doesn't work because the Docker daemon is running on a different machine. One possibility: as an optimisation, we could change that to something like docker.mount_or_copy_into(...) and perform a mount on CIs that support it.
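A minimal sketch of what such a mount-or-copy decision might look like. The function name `mount_or_copy_args` and the `can_mount` flag are hypothetical, not cibuildwheel's API; how a CI would be detected as mount-capable is left open. The only real interfaces assumed are the Docker CLI's `--volume` option and the `docker cp` subcommand.

```python
def mount_or_copy_args(project_dir, container_dir, container_name, can_mount):
    """Return (extra_create_args, copy_command) for getting a project into a container.

    Hypothetical helper: `can_mount` is assumed to come from some CI-specific
    check that the Docker daemon runs on the same machine as the build.
    """
    if can_mount:
        # Bind mount: the container sees the host directory directly, no copy.
        return ["--volume", f"{project_dir}:{container_dir}"], None
    # Remote daemon (for example CircleCI): fall back to copying the tree.
    # The trailing "/." asks `docker cp` to copy the directory's contents.
    copy_command = [
        "docker", "cp",
        f"{project_dir}/.",
        f"{container_name}:{container_dir}",
    ]
    return [], copy_command


create_args, copy_command = mount_or_copy_args(
    "/home/ci/project", "/project", "cibw", can_mount=False
)
print(create_args, copy_command)
```

Returning the arguments instead of running Docker keeps the decision separate from the side effects, so the fallback path can be checked without a daemon.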

  2. The wheels are built with pip wheel, which copies the entire project directory. Ideally pip wheel wouldn't do this (pypa/pip#7555), but I'm not sure that is going to be resolved anytime soon. As a workaround, you can avoid that by running python setup.py bdist_wheel instead.

I don't remember the details, but last time this came up, we concluded that python setup.py bdist_wheel isn't particularly future-proof. I think pip has a flag to disable build isolation; could that help? PIP_NO_BUILD_ISOLATION or something like that. Otherwise, we might be moving towards python-build (#442) soon; perhaps that would change things?

joerick avatar Nov 26 '20 13:11 joerick

PIP_NO_BUILD_ISOLATION controls whether pip creates a virtualenv containing just the specified build dependencies, but that is independent of whether it copies the source tree.

jbms avatar Nov 26 '20 15:11 jbms

This might affect us: https://github.com/pypa/pip/issues/7555 - pip 21.3 is the target, and it's experimentally available now via a feature flag.

henryiii avatar May 13 '21 18:05 henryiii

We could wait until 21.3, but otherwise I think we would need an option in cibuildwheel to specify --use-feature=in-tree-build.

jbms avatar May 13 '21 19:05 jbms

Pip 21.3 is out, but a bit stuck on manylinux not being able to update due to Travis issues (https://github.com/pypa/manylinux/pull/1207). That said, you never need options added to cibuildwheel for pip: pip can be controlled through environment variables. PIP_USE_FEATURE=in-tree-build, I think.
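As a rough illustration of the environment-variable route: pip reads any long command-line option from a variable named `PIP_<UPPER_LONG_NAME>`, with dashes replaced by underscores. The helper names below (`pip_option_to_env_var`, `env_with_pip_option`) are made up for this sketch, and the `in-tree-build` value only matters on pip versions where that behaviour is still behind the feature flag.

```python
import os


def pip_option_to_env_var(option):
    """Map a pip long option such as '--use-feature' to its environment variable name."""
    return "PIP_" + option.lstrip("-").upper().replace("-", "_")


def env_with_pip_option(base_env, option, value):
    """Return a copy of base_env with the given pip option set through its variable."""
    env = dict(base_env)
    env[pip_option_to_env_var(option)] = value
    return env


# Environment that could be handed to a subprocess running `pip wheel`.
env = env_with_pip_option(os.environ, "--use-feature", "in-tree-build")
print(env["PIP_USE_FEATURE"])
```

With cibuildwheel, the same variable would presumably be set through its `CIBW_ENVIRONMENT` option so that it reaches the build environment, including the Linux container.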

henryiii avatar Oct 13 '21 14:10 henryiii

Pip 21.3 is now the default, so this is halfway optimized (and fully optimized on non-Linux).

henryiii avatar Nov 28 '21 16:11 henryiii

At this point, I do wonder whether, if we were to change to a mount-or-copy model, some users might see weird build errors from old build artifacts lying around. Before, the different architectures and musl/glibc builds were isolated from each other, but with a mount they would all coexist in the same build tree.
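One possible mitigation, sketched here purely as an assumption and not as anything cibuildwheel does: clear the intermediate directories that setuptools leaves in the source tree (`build/` and `*.egg-info`) before each build in a shared tree. The helper name `clean_build_artifacts` is hypothetical, and projects with other in-tree outputs would need their own list.

```python
import shutil
from pathlib import Path


def clean_build_artifacts(project_dir):
    """Remove setuptools' default in-tree build outputs; return the names removed."""
    project = Path(project_dir)
    removed = []
    # `build/` holds compiled intermediates; `*.egg-info` holds generated metadata.
    for path in [project / "build", *project.glob("*.egg-info")]:
        if path.is_dir():
            shutil.rmtree(path)
            removed.append(path.name)
    return sorted(removed)
```

Running this between builds for different architectures or libc variants would keep objects compiled for one target from being reused by the next, at the cost of losing any incremental build benefit.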

joerick avatar Apr 01 '23 15:04 joerick