meson-python
Implement prepare_metadata_for_build_wheel
We should implement this method, since if it is unavailable, pip will call build_wheel to find the metadata.
Since we don't support editable installs, if a project supports both meson and setuptools as a build system, the project will be built unnecessarily with meson before proceeding with an editable install with setup.py.
In order to do this, we just need to write a .dist-info directory with only METADATA and WHEEL (no need for RECORD).
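A minimal sketch of what such a hook could write (the package name, version, and WHEEL fields here are placeholders for illustration, not meson-python's actual implementation):

```python
import pathlib

def prepare_metadata_for_build_wheel(metadata_directory, config_settings=None):
    # Hypothetical sketch: write a .dist-info containing only METADATA and
    # WHEEL; PEP 517 allows RECORD to be omitted at this stage.
    name, version = "example", "1.0"  # placeholder static metadata
    distinfo = pathlib.Path(metadata_directory) / f"{name}-{version}.dist-info"
    distinfo.mkdir(parents=True, exist_ok=True)
    (distinfo / "METADATA").write_text(
        f"Metadata-Version: 2.1\nName: {name}\nVersion: {version}\n"
    )
    (distinfo / "WHEEL").write_text(
        "Wheel-Version: 1.0\nGenerator: example 0.0\nRoot-Is-Purelib: false\n"
    )
    # the hook must return the basename of the created .dist-info directory
    return distinfo.name
```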
Where is the fall-back from a pep517-build to a setuptools-build when the pep517 backend does not support editable install documented? At a quick look I don't see it in pep517. If the installation is going to proceed with setuptools, what's the purpose of obtaining metadata from a different pep517 backend? Isn't it going to be discarded anyway?
if a project supports both meson and setuptools as a build system
I wanted this for SciPy, but it is impossible by design, pip doesn't allow it. You can make duplicate sets of setup.py and meson.build files in your source tree, but you need a one line patch to switch build backends. Using pip install --no-pep-517 does not work.
Since we don't support editable installs, if a project supports both meson and setuptools as a build system, the project will be built unnecessarily with meson before proceeding with an editable install with setup.py.
We should support it soon. But other than that, is this actually happening? That seems like a pip bug, since they have on purpose not even made --no-pep-517 work.
The only way I know to make this work is to call python setup.py develop explicitly.
When we released SciPy 1.9.0 the sdist had build-backend = 'mesonpy', and I told all distributors that still had trouble with building (e.g. due to less flexible BLAS support) to carry a patch to comment out that one line, and then things do work as expected with setuptools.
@rgommers Sure. I meant that calling setup.py directly is the only way without patching pyproject.toml. Of course if you change the build backend declaration, the right backend is picked up by the frontend.
if a project supports both meson and setuptools as a build system
I wanted this for SciPy, but it is impossible by design,
pip doesn't allow it. You can make duplicate sets of setup.py and meson.build files in your source tree, but you need a one line patch to switch build backends. Using pip install --no-pep-517 does not work.
pip is actually smart enough to fall back to setup.py, when it can't find the build_editable hook.
But I guess this doesn't matter much since the editable installs PR got updated. I'm happy to wait for that instead if its landing soon.
Implementing this is possible, but the commitment to use the metadata from the prepare_metadata_for_build_wheel hook is annoying, and makes things a bit more complicated.
Actually, we could consider simply checking the metadata from the prepare_metadata_for_build_wheel hook against the metadata we were going to generate and error out if they differ.
To be clear, this can be implemented, and would be a nice improvement, just not really high-priority right now, so I am re-opening.
The specific workflow issue mentioned is solved by editable installs, but the proposal itself is still relevant.
Can you explain why it's an improvement? What does it enable that isn't possible (or slower, or ...) now?
Fetching the metadata for the package. See https://pypa-build.readthedocs.io/en/stable/api.html#build.util.project_wheel_metadata, and https://github.com/jaraco/jaraco.packaging as an example use-case. This is especially relevant in our build backend because we are mainly target compiled, often big or even very big, projects, and having to compile the full project to get the metadata is not great.
That example use case uses setuptools_scm. SciPy, Pandas et al. also have a dynamic version - at that point I think this hook can do nothing else but build the wheel anyway, right? This new hook will return metadata without building if and only if there is no dynamic field usage?
Then the second question is still, what is the actual use case. Does this hook get used in situations where a wheel does not get built? If so, it's clear to me - that saves an expensive build operation.
No, I mean look at what that project does, not the build backend it uses. It fetches the metadata and injects some of the fields in sphinx.
This new hook will return metadata without building if and only if there is no dynamic field usage?
No, it will, and that's one of the major reasons to fetch the metadata from the backend instead of reading pyproject.toml directly. The main issues with reading pyproject.toml directly are that 1) not every backend uses it, and 2) there may be dynamic fields.
Then the second question is still, what is the actual use case. Does this hook get used in situations where a wheel does not get built? If so, it's clear to me - that saves an expensive build operation.
If this hook is available, build.util.project_wheel_metadata will use it to generate the metadata, if it is not, it will use build_wheel and extract the metadata from it. This is the recommended behavior from PEP 517.
https://peps.python.org/pep-0517/#prepare-metadata-for-build-wheel
If a build frontend needs this information and the method is not defined, it should call build_wheel and look at the resulting metadata directly.
So yeah, it will save an expensive operation, and that is the use-case.
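The frontend-side rule from PEP 517 can be sketched roughly like this (a simplified model for illustration, not pip's or build's actual code; in reality the hooks are module-level functions and the fallback unpacks the .dist-info out of the built wheel):

```python
def wheel_metadata_dirname(backend, metadata_directory):
    """Simplified model of the PEP 517 metadata rule (illustration only)."""
    hook = getattr(backend, "prepare_metadata_for_build_wheel", None)
    if hook is not None:
        # cheap path: the backend writes .dist-info without a full build
        return hook(metadata_directory)
    # expensive fallback mandated by PEP 517: build the whole wheel just
    # to read the metadata out of it afterwards
    wheel_name = backend.build_wheel(metadata_directory)
    return wheel_name  # caller would then extract .dist-info from the wheel
```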
It fetches the metadata and injects some of the fields in sphinx.
If that's the use case, it seems misguided. The metadata is for building a wheel only. That doesn't have to be the same metadata as for the project as a whole, for other wheels or for the sdist. Conceptually, that package does the wrong thing. If it wants package/sdist metadata, it should not be asking for wheel metadata. It's also not a build frontend, so it should not be calling this hook.
The main issues with reading pyproject.toml directly are that 1) not every backend uses it, and 2) there may be dynamic fields.
AFAIK, for meson-python we have no other place to put metadata except in pyproject.toml. And we do not want to support anything else.
If this hook is available, build.util.project_wheel_metadata will use it to generate the metadata; if it is not, it will use build_wheel and extract the metadata from it. This is the recommended behavior from PEP 517.
I know that, but for meson-python, I am asking why this matters. I think our situation is:
- metadata lives in pyproject.toml
- if there are no dynamic fields, for this new hook we'd read pyproject.toml and return the metadata
- if there are dynamic fields, we'd run the whole build
So it seems to me that it matters little whether we implement this hook or not. Maybe it improves terminal output?
If a build frontend needs this information and the method is not defined, it should call build_wheel and look at the resulting metadata directly.
So yeah, it will save an expensive operation, and that is the use-case.
Can you go one level deeper, to an actual real-world use case? Is there a case where this information is needed by a build frontend, and then it ends up not building the wheel right after?
I'm not sure implementing this hook can even be done correctly without building; PEP 517 says it must be a valid .dist-info/ directory except the RECORD file may be missing. So that implies that, for example, the WHEEL file must be present. meson-python uses build time heuristics to determine the contents of that file, so it is quite hard to generate it without doing any building. I suspect that the PEP 517 authors intended for .dist-info/ to only contain METADATA, but the way it's written requires WHEEL.
So it seems to me that it matters little whether we implement this hook or not. Maybe it improves terminal output?
Yes. We will get
Preparing editable metadata (pyproject.toml) ... error
error: metadata-generation-failed
if anything goes wrong during build, which is not the most accurate error message, and can be confusing.
Other than that, I don't think there's too much other benefit to this, other than this being the right thing to do.
Can you go one level deeper, to an actual real-world use case? Is there a case where this information is needed by a build frontend, and then it ends up not building the wheel right after?
Then the second question is still, what is the actual use case. Does this hook get used in situations where a wheel does not get built? If so, it's clear to me - that saves an expensive build operation.
After some re-reading of this, I think our conclusion here is still correct. However, some PyPA folks are of the opinion that this hook is a general one that they're allowed to call if they want any metadata for the project. This seems clearly wrong though - the name of the hook makes this abundantly clear. And the call producing a .dist-info/ directory, in addition to possibly triggering a full build, makes it a terrible way of querying metadata. So on the one hand, I wouldn't want to encourage such off-label usage. On the other hand, it's not unlikely that some project (maybe even pip) may start doing so anyway.
Furthermore, replying to thoughts from @dnicolodi in https://github.com/mesonbuild/meson-python/pull/478#issuecomment-1704989906:
To produce metadata we need to know if the wheel is pure or not (ie if it included platform or python version specific content) and for doing that we need to configure, and in some occasions, actually build the project.
Yes, good point. We don't know that statically (at least not right now, and we'd like to avoid asking package authors to define that in a setting). Running meson setup requires a dev setup etc., which is expensive.
PEP 660 is not clear on what prepare_metadata_for_build_editable is used for by the front-end. I fear that some front-ends may execute prepare_metadata_for_build_editable before build_editable, just in case. This would result in meson-python doing work twice for no good reason. Lacking a use case requiring prepare_metadata_for_build_editable (or prepare_metadata_for_build_wheel as requested in #236), I much prefer not to implement these methods.
IIRC that hook had to do specifically with some funky requirement for installing extra runtime dependencies if and only if the build is editable (a la https://pypi.org/project/editables/). It doesn't seem healthy to me, and we don't need this for meson-python.
if anything goes wrong during build, which is not the most accurate error message, and can be confusing.
Other than that, I don't think there's too much other benefit to this
This assessment is still correct I think. Given that there's a fair bit of complexity involved and that it doesn't look like we can avoid running meson setup, I think this may not be worth doing. So I think I agree that we can close this issue @dnicolodi.
In case someone shows up with a clean implementation that doesn't impose extra maintenance overhead, we can still merge that to get the nicer error message on build failure.
In case someone shows up with a clean implementation that doesn't impose extra maintenance overhead, we can still merge that to get the nicer error message on build failure.
I think we got rid of the ugly error messages already. At least I haven't seen an ugly pip splat in a while.
We got rid of the long tracebacks, but the end result is still a bit misleading as @lithomas1 pointed out. It looks like this now:
ninja: build stopped: subcommand failed.
[end of output]
note: This error originates from a subprocess, and is likely not a problem with pip.
error: metadata-generation-failed
× Encountered error while generating package metadata.
╰─> See above for output.
note: This is an issue with the package mentioned above, not pip.
hint: See above for details
It's failing with a compile error, and pip unhelpfully reports "metadata-generation-failed" because we're doing a full build when pip has just called prepare_metadata_for_build_wheel. A very minor issue at this point, but still just a tiny bit off.
Ah. I see what you mean now. Is pip calling prepare_metadata_for_build_wheel through some wrapper? meson-python does not export that hook, so pip could know that it is not just generating metadata, and fix the error message.
Ah right - yes, this is arguably a pip bug. It is calling build_wheel. I don't think there's a wrapper involved.
Then the second question is still, what is the actual use case. Does this hook get used in situations where a wheel does not get built? If so, it's clear to me - that saves an expensive build operation.
I think I can provide a more practical use case. In Fedora Linux, we use this hook to generate RPM BuildRequires dynamically with the %pyproject_buildrequires macro; see https://src.fedoraproject.org/rpms/pyproject-rpm-macros. We generally want the runtime Requires to be BuildRequires too—because that keeps us from building a package that will fail to install due to missing dependencies, and because we want to run at least an import-only “smoke test” and preferably the entire test suite as part of the RPM build process.
It’s possible to work around a build backend that doesn’t support the hook by passing the -w option to %pyproject_buildrequires, but this of course requires building the wheel, which is a lot of effort for something like scipy. There is some relevant downstream discussion in https://src.fedoraproject.org/rpms/scipy/pull-request/31.
Also potentially relevant is https://github.com/pypa/hatch/issues/128, in which this was discussed (and eventually implemented) for hatchling.
In Fedora Linux, we use this hook to generate RPM BuildRequires dynamically with the %pyproject_buildrequires macro
That does not look like a valid use of this hook, and does not seem like a good idea to me. On Linux, meson-python has dependencies on ninja and patchelf, and it dynamically inspects if the system already has them installed. So if you're using the hook to look for build-time dependencies, things go in circles.
More generally, this cannot work any time a package has a system (non-PyPI) build dependency. You may get a lot further after http://peps.python.org/pep-0725 if that gets accepted and widely implemented. But for now, your macro has a big gap in its design.
When all dependencies are static, you can read pyproject.toml directly. When they're not, you cannot get a build env set up this way no matter what you do.
It’s possible to work around a build backend that doesn’t support the hook by passing the -w option to %pyproject_buildrequires, but this of course requires building the wheel, which is a lot of effort for something like scipy. There is some relevant downstream discussion in https://src.fedoraproject.org/rpms/scipy/pull-request/31.
I am actually very interested in improving the way distros translate metadata from pyproject.toml to their own formats and generate build recipes. It is possible I think to make that pretty reliable - it was one of the motivations for PEP 725. But the correct way to do this is to read pyproject.toml directly. Calling this hook triggers a full build very often - as you found out - because if anything is dynamic (version is quite common) the build backend has little choice but to do a full build.
I read it more carefully, including all the warnings and disclaimers in https://src.fedoraproject.org/rpms/pyproject-rpm-macros about double builds etc. The design of %pyproject_buildrequires is clearly suboptimal - it really should read the dependencies = [...] entry directly from pyproject.toml. And only if that is listed as dynamic, then call this hook. For SciPy specifically, that would solve the problem. Also, SciPy has zero build requirements that are not also runtime requirements, so you can just leave that macro out completely.
Hello. The designer of the RPM macro here.
We know it cannot list system dependencies and we live with that. It simply generates dependencies on Python packages. We also know it goes through several loops. And we are fine with that.
Is it suboptimal? Maybe. But it works quite reliably and well by following the standardized PEP 517/PEP 518 protocols. If we parse pyproject.toml manually for project.dependencies first, we diverge from that protocol.
The macro actually works quite well when build backends have prepare_metadata_for_build_wheel. It's just when they don't and we need to build the wheel in order to read the metadata (as PEP 517 says), large projects like scipy make that suboptimal, because we might end up building the wheel multiple times (due to technicalities of how RPMs are built).
If we parse pyproject.toml manually for project.dependencies first, we diverge from that protocol.
You really don't if you implement what I said above. When dependencies is static, the list is guaranteed to not change so it's perfectly safe to read it directly. pip is likely also going to start doing that in some cases. There is no risk of divergence/mismatch here, it's strictly an improvement.
I suppose it is strictly an improvement. However, it is not a requirement. I understood your comment as "you are doing it wrong".
I've opened an RFE for this.
BTW @gotmax23 pointed out that not all build backends necessarily understand/support/use the pyproject.toml [project] table (originally PEP 621) and reading the runtime dependencies from project.dependencies (or assuming there are none if the list is absent from an existing [project] table) is potentially unsafe.