rules_python icon indicating copy to clipboard operation
rules_python copied to clipboard

feat(bzlmod): allow overriding TOOL_VERSIONS and base_url

Open aignas opened this issue 1 year ago • 3 comments

Before the PR users would not be able to override various things configured by rules_python when it comes to downloading toolchains. In the WORKSPACE world, they would be able to do this and there could be users of rules_python who might not be able to migrate to bzlmod due to the lack of such capability.

This PR refactors the internal register_all_versions tag class and adds a generic way to override how python toolchains are registered and how they are downloaded.

Fixes #1172 Work towards https://github.com/bazelbuild/rules_python/issues/2081

aignas avatar Aug 23 '24 10:08 aignas

Overall, I think its pretty promising.

All the override APIs need is_root checks, though -- letting intermediate modules override would cause quite a headache.

Did you look at some of the other use cases in https://github.com/bazelbuild/rules_python/issues/2081 ?

Some cases that seem relevant to this api are:

  • Using a local toolchain
  • Using the free-thread interpreters
  • Using one of the alternative LTO/debug builds
  • Using a musl interpreter

Though, I'm not sure if it makes sense to override entirely vs register an additional toolchain with an additional constraint. I guess overriding provides an escape hatch. Part of me thinks that it might make more sense to push those cases into "register your own toolchain, here's some helpers to make it easier for you" -- it feels like, if you're overriding things such that they no longer resemble the python-build-standalone tars that get extracted and processed, then you're probably better off not trying to override the machinery that expects such a tar file.

rickeylev avatar Aug 28 '24 04:08 rickeylev

Forgot one comment: in theory, can we use these APIs ourselves to register everything? i.e. replace versions.bzl with a series of calls in MODULE.bazel? I'm not suggesting we do that, but if we can, then that's pretty cool, and I think points to the API being in the right direction.

rickeylev avatar Aug 28 '24 04:08 rickeylev

All the override APIs need is_root checks, though -- letting intermediate modules override would cause quite a headache.

Right now everything is under mod.is_root if statement, but I think the code needs a little bit of a refactor and test coverage, so that will be improved.

Did you look at some of the other use cases in #2081 ?

Some cases that seem relevant to this api are:

  • Using a local toolchain

I haven't thought about the local toolchain yet.

  • Using the free-thread interpreters
  • Using one of the alternative LTO/debug builds
  • Using a musl interpreter

Three of the above could be modelled as target platform with different config settings:

  • free-threaded interpreters won't be available for all OS/Arches/versions initially, so asking the user to use a specific set of flag_values may be a good idea. The PLATFORMS right now is static, so actually doing that requires them to change rules_python, but I do think we could make it more ergonomic.
  • LTO/debug builds probably should have //python/config_settings:lto=[auto|enabled|disabled], debug builds - similarly? If we allow users to override the available platforms, that could be more doable I think.
  • MUSL interpreter should be handled by flag_values, this is already the case for glibc toolchain builds: https://github.com/bazelbuild/rules_python/blob/636105772e5264ed2564b627ff4d09b40532f3c3/python/versions.bzl#L529

Though, I'm not sure if it makes sense to override entirely vs register an additional toolchain with an additional constraint. I guess overriding provides an escape hatch. Part of me thinks that it might make more sense to push those cases into "register your own toolchain, here's some helpers to make it easier for you" -- it feels like, if you're overriding things such that they no longer resemble the python-build-standalone tars that get extracted and processed, then you're probably better off not trying to override the machinery that expects such a tar file.

Exactly my thoughts.

Forgot one comment: in theory, can we use these APIs ourselves to register everything? i.e. replace versions.bzl with a series of calls in MODULE.bazel? I'm not suggesting we do that, but if we can, then that's pretty cool, and I think points to the API being in the right direction.

Yeah, I think we should do that. The refactor I mentioned above is to allow for that. This idea came to me when I was looking how I was constructing and overriding the tool_versions and I think we should be able to dog food the API.

aignas avatar Aug 28 '24 08:08 aignas

So I think I got overrides working for almost everything that is in WORKSPACE world when it comes to toolchains, but I would like to have links in the docs when I do {obj}\python.override``. Do you know how to get them?

TODO for the PR to be ready for the review:

  • [ ] Tests to cover the business logic
  • [ ] Finish docs

aignas avatar Aug 30 '24 09:08 aignas

The "lookup" rules for refs is currently pretty simple. IIRC, it checks 3 places:

  • The "short names" for the current doc
  • The "short names" of everything (union of short names from all docs)
  • The fully qualified named (it might try fully qualified with and without the repo name, can't remember)

A short name is the last-component of the fully qualified name of an object. So something like a tag class arg would have a fully qualified name like @rules_python//extensions/python.bzl%python.override.someattr, and the short name would be "someattr". (note, under the hood, the full name is normalized to a dotted name, because i'm lazy and didn't like typing all the slashes in my pdb prompt)

So, python.override probably won't match anything given the above logic. override might match. The fully qualified name should work, too.

FWIW, I think this xref logic needs improvement. It was just easy to build a couple maps to check for keys than figure out what constitutes an intuitive and unsurprising resolution scheme. The logic for it lives here, if you want to try your hand at it: https://github.com/bazelbuild/rules_python/blob/main/sphinxdocs/src/sphinx_bzl/bzl.py#L1518

Maybe a more sensible algorithm would be to, given X.Y, go "up" the "scope tree" looking for "X", then go "down" the X's "object tree" looking for "Y". This is loosely how most lexical scoping works.

HTH

(Note to self: document how xref resolution works)

rickeylev avatar Aug 30 '24 15:08 rickeylev

making python.override link

https://github.com/bazelbuild/rules_python/pull/2174 should fix this.

rickeylev avatar Sep 02 '24 01:09 rickeylev

Created #2189 and #2188 so that it is easier to review this change later.

aignas avatar Sep 05 '24 13:09 aignas

This is a bit too big to easily review, will split it up.

aignas avatar Sep 07 '24 13:09 aignas