build icon indicating copy to clipboard operation
build copied to clipboard

build: add circular dependency checker for build requirements

Open jameshilliard opened this issue 4 years ago • 6 comments

Implement a basic build requirement cycle detector per PEP-517:

  • Project build requirements will define a directed graph of requirements (project A needs B to build, B needs C and D, etc.) This graph MUST NOT contain cycles. If (due to lack of co-ordination between projects, for example) a cycle is present, front ends MAY refuse to build the project.

  • Front ends SHOULD check explicitly for requirement cycles, and terminate the build with an informative message if one is found.

See: https://www.python.org/dev/peps/pep-0517/#build-requirements

jameshilliard avatar Dec 12 '21 14:12 jameshilliard

I have some concerns, in decreasing order of abstraction:

  1. I'm not sure if the quoted requirement from PEP 517 should (or can) apply to build. Firstly, the name of the project being built cannot be extracted reliably without triggering a build, i.e. without having previously installed its build-requires. Secondly, build does not do nested builds. For a project to self-depend, either directly or indirectly, it needs to be built by what the PEP calls an integration frontend, e.g. a package manager. It would then follow that it's the integration frontend's responsibility to resolve circular dependencies.
  2. Extracting project metadata is, arguably, outside the scope of a build frontend. The name should be provided as input to the build frontend if we're expected to operate on it.
  3. An implementation based on PEP 621 metadata will have a very short reach. PEP 621 is a new standard and which is yet to be adopted by either setuptools or Poetry.
  4. There are things the implementation here does not account for, like dependency name normalisation, dynamic names and circular dependencies from get_requires.

layday avatar Dec 13 '21 10:12 layday

I'm not sure if the quoted requirement from PEP 517 should (or can) apply to build. Firstly, the name of the project being built cannot be extracted reliably without triggering a build, i.e. without having previously installed its build-requires.

Yeah, I refactored this a bit to do post build name extraction and validation in cases where PEP 621 names aren't available.

Secondly, build does not do nested builds. For a project to self-depend, either directly or indirectly, it needs to be built by what the PEP calls an integration frontend, e.g. a package manager. It would then follow that it's the integration frontend's responsibility to resolve circular dependencies.

The purpose of this is to prevent packages from being built that have circular dependencies, currently build is allowing invalid projects(ones with dependency cycles) to be built without any indication that something is wrong. This is supposed to catch dependency cycle issues before a package is published as other tools(integration frontends/package managers) may often be entirely unable to resolve circular dependencies.

Extracting project metadata is, arguably, outside the scope of a build frontend. The name should be provided as input to the build frontend if we're expected to operate on it.

So project name can be now manually set via build.project_name, although I have things so that build will opportunistically set the self.project_name if that information becomes easily available at any point during any part of the build.

An implementation based on PEP 621 metadata will have a very short reach. PEP 621 is a new standard and which is yet to be adopted by either setuptools or Poetry.

Yeah, I now have it so it shouldn't depend on PEP 621 metadata to be able to do the cycle check, but it should use PEP 621 metadata when available as it allows for earlier detection of dependency cycles prior to a build.

There are things the implementation here does not account for, like dependency name normalisation, dynamic names and circular dependencies from get_requires.

Dependency name normalization should now be handled when comparing dependencies for cycle checks.

I think this can be fairly easily extended to do checks like the ones for get_requires as well, I just started with checking the dependencies that don't depend on a specific distribution since those seem to be the most problematic.

jameshilliard avatar Dec 17 '21 17:12 jameshilliard

I've further extended the dependency checker to check dependencies originating from the build-backend, this now catches the invalid dependency cycle in the wheel PEP517 build:

ERROR Failed to validate `build-system` in pyproject.toml, dependency cycle detected: `wheel` -> `setuptools.build_meta` -> `wheel`

This also catches the(soon to be fixed once flit_core vendors tomli) invalid dependency cycle:

ERROR Failed to validate `build-system` in pyproject.toml, dependency cycle detected: `tomli` -> `flit_core>=3.2.0,<4` -> `tomli`

jameshilliard avatar Dec 27 '21 02:12 jameshilliard

The code doesn't check for cycles. It checks whether something else depends on the project under build, specifically. The PEP says that the build dependency graph must be acyclic.

It's ok if we don't want to validate the graph in whole, just want to make sure we're aware and ok with that before continuing with the review.

layday avatar Dec 27 '21 09:12 layday

The code doesn't check for cycles. It checks whether something else depends on the project under build, specifically.

Yeah, it currently just checks for a specific type of cycle, a cycle where the project depends on itself.

The PEP says that the build dependency graph must be acyclic.

This is mostly designed to catch accidental self cycles before a package author publishes a package.

It's ok if we don't want to validate the graph in whole, just want to make sure we're aware and ok with that before continuing with the review.

Yeah, I'm aware it's not covering all cases, I figured this was a good starting point though as this catches some of the most problematic cases.

jameshilliard avatar Dec 28 '21 16:12 jameshilliard

I've updated this to also now better detects self-cycles in situations using get_requires and fixes some tests.

jameshilliard avatar Jan 10 '22 02:01 jameshilliard

Doing some cleaning of the PR queue, this seems to have been open for a long time. Sorry @jameshilliard. Feel free to ping me directly, as often as you'd like, if this ever happens in the future.

I'll close this as there are several conflicts with the current main, and I don't think this is a particularly important issue.

Realistically, I think a circular dependency with the current version being built can only happen if someone downstream add a dependency before the package they're depending on is released, no? This would also make it so that the latest release of the downstream package won't work until the new version of dependency is released.

Other circular dependencies should already be handled by pip, which we use to provision the build environment, AFAICT.

FFY00 avatar Mar 20 '23 22:03 FFY00

Realistically, I think a circular dependency with the current version being built can only happen if someone downstream add a dependency before the package they're depending on is released, no?

That wasn't the issue this is supposed to catch from my understanding.

This would also make it so that the latest release of the downstream package won't work until the new version of dependency is released.

This is less related to versioning and is rather intended to catch cases where packages accidentally have pep517 violating circular build dependencies.

Other circular dependencies should already be handled by pip, which we use to provision the build environment, AFAICT.

Well the issue is that pip doesn't properly detect circular build system dependencies for pure python package dependencies that have wheels available. This means that a package which may build with pip will fail with other build systems that effectively enforce the pep517 no circular dependencies rule, for example buildroot's python infrastructure has a no circular dependencies requirement as well as a no wheels requirement.

jameshilliard avatar Mar 20 '23 22:03 jameshilliard

Doing some cleaning of the PR queue, this seems to have been open for a long time. Sorry @jameshilliard. Feel free to ping me directly, as often as you'd like, if this ever happens in the future.

I'll close this as there are several conflicts with the current main, and I don't think this is a particularly important issue.

@FFY00 Rebased on top of main.

jameshilliard avatar Mar 20 '23 23:03 jameshilliard

I can't re-open the PR for some reason, can you? If not, just open a new PR. I'll try to make sure I review it, but feel free to ping me.

FFY00 avatar Mar 21 '23 20:03 FFY00

You can't reopen a PR that has been force pushed while closed. But you can make a new PR.

henryiii avatar Mar 21 '23 21:03 henryiii

You can if you force push it back to what the commit sha1 was at the time of it being closed.

eli-schwartz avatar Mar 21 '23 21:03 eli-schwartz

Reopened as #593.

jameshilliard avatar Mar 21 '23 21:03 jameshilliard