
Some questions around `experimental_index_url` behavior

Open FrankPortman opened this issue 1 month ago • 7 comments

Opening a ticket to collect all my questions for triage, per a suggestion in Slack.

In this thread (https://github.com/bazel-contrib/rules_python/issues/2949#issuecomment-3448011820) it was suggested that `experimental_index_url` would fix the issues I was seeing with `RULES_PYTHON_ENABLE_PIPSTAR=1`. That turned out to be true, but a few things came up that I would like clarification on.
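For context, pipstar is toggled via an environment variable; one way to set it for the whole workspace (a sketch, assuming the documented `RULES_PYTHON_ENABLE_PIPSTAR` toggle) is a `.bazelrc` line:

```
# .bazelrc
# Enable the Starlark (pipstar) code path in rules_python for all
# repository fetches; remove the line to fall back to the default.
common --repo_env=RULES_PYTHON_ENABLE_PIPSTAR=1
```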

Setup

We use a "universal" lockfile generated via uv that looks like this:

triton==3.4.0 ; python_full_version < '3.13' and platform_machine == 'x86_64' and sys_platform == 'linux' \
    ...

and, as a result, we don't specify `requirements_by_platform` in `pip.parse`:

pip.parse(
    enable_implicit_namespace_pkgs = True,
    experimental_requirement_cycles = { ... bunch of cycles ... },
    hub_name = "core_py_deps",
    python_version = "3.12.3",
    requirements_lock = "//:3rdparty_python_requirements.txt",
)

Questions/Concerns from the switch

Switching allowed my genquery based tests to work alongside pipstar, but I was hoping to get clarification on the following:

  • One of my tests had the line `r.Rlocation("rules_python++pip+core_py_deps_312_attrs/attrs-25.4.0-py3-none-any.whl")`, which I had to switch to `r.Rlocation("rules_python++pip+core_py_deps_312_attrs_py3_none_any_adcf7e2a/attrs-25.4.0-py3-none-any.whl")`. I found the new path by running cquery manually on the CLI, and I am not sure why this is happening or what the idiomatic path forward is. Interestingly, the new path works on both OSX and Linux.
  • On the branch with `experimental_index_url`, I noticed it was downloading all the Linux-only torch/CUDA wheels. I don't think this happens normally, but I am not an expert. The build still works on Mac, but this is a little concerning since those wheels are massive. Am I thinking about this the wrong way? I can confirm that in the output_base of a different branch, the external folder doesn't contain non-OSX wheels; in the `experimental_index_url` branch, it does.
  • `bazel query deps(...)` now fails because it is trying to install a Windows-only package (pywinpty). That package is in my universal lockfile with `os_name == 'nt'`.

FrankPortman · Nov 25 '25 17:11

  • This path is an implementation detail; it is best to use the usual Bazel methods to get the rlocationpath of the wheel. E.g. use `$(rlocationpath ...)` in your BUILD file and pass the path via an env var.
  • This is working as intended, in the sense that a genquery executing the query will fetch all of the wheels. If you are not interested in cross builds and only care about host-platform execution, then I could think about how to accommodate this; let me know how important it is. Since by default rules_python attempts to support all of the default platforms, it is best to use `requirements_by_platform` to avoid fetching wheels for platforms you don't care about.
  • The third item should be fixed on main.
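For the first point, the pattern might look roughly like this (target and repo names here are hypothetical, and the `:whl` target layout is an assumption about the hub repo):

```
# BUILD file sketch: names are illustrative, not from this repo.
py_test(
    name = "wheel_contents_test",
    srcs = ["wheel_contents_test.py"],
    data = ["@core_py_deps//attrs:whl"],
    env = {
        # Bazel expands this at analysis time, so the hashed spoke-repo
        # path never has to appear in source; the test reads the env var
        # and resolves it with the runfiles library.
        "ATTRS_WHL": "$(rlocationpath @core_py_deps//attrs:whl)",
    },
)
```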

If you don't need to cross-build sdists, the new experimental code paths should work well and speed up builds in general.
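A minimal sketch of the `requirements_by_platform` suggestion, assuming per-platform lockfiles (file names here are made up; check the rules_python docs for the exact platform labels):

```
pip.parse(
    hub_name = "core_py_deps",
    python_version = "3.12.3",
    # Fetch from the index via the Bazel downloader.
    experimental_index_url = "https://pypi.org/simple",
    # Only resolve for the platforms you actually build for.
    requirements_by_platform = {
        "//:requirements_linux_x86_64.txt": "linux_x86_64",
        "//:requirements_osx_aarch64.txt": "osx_aarch64",
    },
)
```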

aignas · Nov 26 '25 00:11

TY for the quick reply.

  • Heard, re using rlocationpath/env vars. We used to do that more, but found some friction when using things like multirun or building for non-Bazel execution environments. Since the example I linked was from a very specific test, I am not too worried about it. I did also see that the prefix `adcf7e2a` is based on the hash of the wheel chosen for attrs.
  • Sorry, could you clarify? I am following up with my own tests, but are you saying one or all of the following:
    • my genquery (which references torch) is causing it to download (but correctly not use) non-OSX wheels
    • just using `experimental_index_url` is causing it to download (but correctly not use) non-OSX wheels
    • using `experimental_index_url` without `requirements_by_platform` is causing it to download (but correctly not use) non-OSX wheels
      • If this one is the problem: would naively feeding in the same universal lockfile for each platform fix things? It has all the `sys_platform` markers.
    • For any of the options above: is there an interaction with pipstar being enabled/disabled?
  • Awesome!

Once I wrap my head around things, I am happy to contribute back some documentation if you feel that this discussion touches on some underdocumented components.

Update: okay, it does seem that if I comment out the genquery usage, it doesn't pull the wheels for other platforms. You asked how important this is: I'd say it's relatively important for us, since we essentially have tests that say "torch should not end up in the transitive dep closure of these targets", and it is very nice to run those as standard Bazel tests rather than as sidecar CI queries. If it can't be supported, it would be nice if it were more apparent that this was quintupling the size of the artifacts we need to pull, though maybe there is no easy way to surface that. If there is something I should ask the Bazel folks about w.r.t. genquery behavior or future enhancements, let me know if the most elegant solution lies closer to that.

FrankPortman · Nov 26 '25 14:11

So to repeat what the feature request would be:

As a Bazel user, I want to blacklist certain deps in a hub repo from ever being part of a target, without needing to download unnecessary information.

Right now you have a genquery to help you out, but the problem is that it downloads all of the dependencies. If we were able to do this by just inspecting the dist-info or something similar, it would be much better. As you mentioned, this can be done using a custom aspect that you run as part of your build. The question is whether this aspect should be part of the rules_python API that we provide to our users.

aignas · Nov 27 '25 01:11

That would be great, but I recognize that it is a very specific use case, especially given how you phrased it, so maybe it is just on me to create an aspect for my codebase. I'm not sure it's realistic for anything in rules_python to warn that it is downloading so many external deps, but it would be nice not to inadvertently blow up a CI machine without expecting it.

Would switching to non-universal lockfiles and specifying `requirements_by_platform` get me the same behavior I want for a simple genquery? I am open to that: we switched to universal lockfiles because uv could generate them and I was tired of manually wrangling pip and `requirements_by_platform`, but IIUC uv can also generate per-platform lockfiles in a nicer way than pip.

FrankPortman · Nov 28 '25 14:11

As for platform-specific vs. universal lockfiles: no, not really. The main question here is whether you want the feature described in #260 to be always on or opt-in only.

I'll leave this ticket open so that we can revisit this and discuss later.

aignas · Nov 29 '25 09:11

> As you mentioned, this can be done using a custom aspect that you run as part of your build. The question is whether this aspect should be part of the rules_python API that we provide to our users.

I went ahead and wrote this aspect internally for the specific use case we had (a test target that proved A doesn't depend on B), and removed all genquery usage from my codebase for now.
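For anyone landing here later, the rough shape of such an aspect (entirely hypothetical names; a sketch of the idea, not the internal implementation) could be:

```
# Sketch: collect the transitive dependency labels of a target so a
# test or rule can assert that a banned package never appears.
DepClosureInfo = provider(fields = ["labels"])

def _dep_closure_aspect_impl(target, ctx):
    labels = [str(target.label)]
    for dep in getattr(ctx.rule.attr, "deps", []):
        if DepClosureInfo in dep:
            labels.extend(dep[DepClosureInfo].labels)
    # depset dedups the accumulated labels.
    return [DepClosureInfo(labels = depset(labels).to_list())]

dep_closure_aspect = aspect(
    implementation = _dep_closure_aspect_impl,
    attr_aspects = ["deps"],
)
```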

> As for platform-specific vs. universal lockfiles: no, not really. The main question here is whether you want the feature described in #260 to be always on or opt-in only.

If I feed the universal lockfile into `requirements_darwin` and `requirements_linux`, then `build //...` actually breaks :/. So that is out for now.

> I'll leave this ticket open so that we can revisit this and discuss later.

Sounds good. Thinking about my current state, where I no longer use genquery but am still somewhat concerned that an innocuous change to my repo could suddenly pull in tens of gigabytes of extra wheels: maybe the easiest feature would be some sort of setting declaring that cross-platform actions aren't supported, so that anything requiring opt-in to that feature would result in (1) a build/analysis failure or, better yet, (2) "just works" a la one of the things we discussed above. That way I have more control over if/when to opt in. Not sure if I described it well, since I have forgotten some context since I first started poking around in this ticket and https://github.com/bazel-contrib/rules_python/issues/2949#issuecomment-3448011820.

FrankPortman · Dec 02 '25 20:12

Yeah, I think my goal would be to change the `experimental_index_url` code path to have a notion of the `target_platforms` that are actually requested by a particular hub, defaulting to the host. I did some experiments with it and it looks promising, but it is too early to say anything yet.

aignas · Dec 02 '25 23:12

OK, since the `target_platforms` attribute has been introduced and documented, I am marking this as done.

aignas · Dec 14 '25 12:12

Thanks so much @aignas. I have one more quick question, just to put a bow on it so I fully understand and can explain it back to my team.

> the new experimental code paths should work well and speed up builds in general.

Outside of supporting cross-platform builds, what is the main speedup/benefit of pipstar and `experimental_index_url`? For the latter, IIUC, one major pro is using the Bazel downloader instead of fetching wheels directly, since that is {faster?, more cacheable?}. I think there is a lot of info scattered across the changelog, Bazel Slack, RTD, and so on, but I would love an answer here while all the context is in one place. I am happy to contribute documentation back if you think there is merit to that.

FrankPortman · Dec 15 '25 16:12

I think we should have a single page for it, TBH, so let's reopen the issue and close it once we document this.

In short:

  • We use the Bazel downloader to download the wheels, which results in faster download times, and we can use Bazel's cache to avoid downloading the same thing twice.
  • For wheel-only setups, this means that we can also set up cross-builds and ensure that the dependencies are always correct.
  • We no longer use a Python interpreter to read the whl METADATA file. Instead we parse METADATA in Starlark and evaluate the `Requires-Dist` markers during the Bazel build rather than while fetching the dependencies. This means that we no longer need to extract the same wheel multiple times when it is used for multiple platforms.
  • We also no longer need to fetch a Python interpreter to start fetching the whl dependencies. This also means that when the Python interpreter version is upgraded, we no longer need to refetch all of the wheels due to cache invalidation.
  • No pip is involved in downloading the wheels.
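For intuition about the marker evaluation mentioned above, here is a toy Python version of what evaluating the triton marker from the lockfile involves (illustration only; the real implementation is in Starlark and handles the full PEP 508 grammar):

```python
# Toy illustration: evaluate the environment marker
#   python_full_version < '3.13' and platform_machine == 'x86_64'
#   and sys_platform == 'linux'
# against a dict describing the target platform.
def marker_matches(env):
    return (
        tuple(int(p) for p in env["python_full_version"].split(".")) < (3, 13)
        and env["platform_machine"] == "x86_64"
        and env["sys_platform"] == "linux"
    )

linux_env = {"python_full_version": "3.12.3",
             "platform_machine": "x86_64",
             "sys_platform": "linux"}
mac_env = {"python_full_version": "3.12.3",
           "platform_machine": "arm64",
           "sys_platform": "darwin"}
print(marker_matches(linux_env))  # True: the triton pin applies
print(marker_matches(mac_env))    # False: skipped on macOS
```

Because this evaluation happens during the build rather than during fetching, the same universal lockfile can serve every configured platform without re-extracting wheels per platform.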

aignas · Dec 15 '25 23:12