rules_python icon indicating copy to clipboard operation
rules_python copied to clipboard

feat: enable pipstar by default

Open aignas opened this issue 6 months ago • 22 comments

This is just a laundry list for the flag to be flipped by default and enable better dependency graph construction using the marker settings.

  • [X] Won't do: Add expressions_any to the env_marker_setting macro so that we can create many targets under the hood so that each FeatureFlagInfo producing target would be processing smaller expressions. At the moment we just do (expr1) or (expr2) or ... combination, which feel like it may make the tokenizing more nested and we cannot short-circuit early unless we improve the parser.
  • [x] Wait for the #2747 to be finished so that users can have levers to pull to configure the env marker evaluation when parsing requirements files.
  • [x] Enable pipstar by default in rules_python CI
  • [ ] Should do: Ask users to test pipstar in isolation before the next release.
  • [X] (optional) add an example in the documentation how to configure platform_version and similar?

Am I missing some steps?

aignas avatar Jun 01 '25 12:06 aignas

Posting here after being directed to comment if I find myself needing RULES_PYTHON_ENABLE_PIPSTAR=0.

When testing the 1.7.0 RC I am getting errors like:

===== stderr start =====
ERROR: Could not find a version that satisfies the requirement triton==3.4.0 (from versions: none)
ERROR: No matching distribution found for triton==3.4.0
Traceback (most recent call last):
  File "<frozen runpy>", line 198, in _run_module_as_main
  File "<frozen runpy>", line 88, in _run_code
  File "/private/var/tmp/_bazel_fp/01684213bb9c693047272ec8f0247481/external/rules_python+/python/private/pypi/whl_installer/wheel_installer.py", line 183, in <module>
    main()
  File "/private/var/tmp/_bazel_fp/01684213bb9c693047272ec8f0247481/external/rules_python+/python/private/pypi/whl_installer/wheel_installer.py", line 168, in main
    subprocess.run(pip_args, check=True, env=env)
  File "/private/var/tmp/_bazel_fp/01684213bb9c693047272ec8f0247481/external/rules_python++python+python_3_12_3_aarch64-apple-darwin/lib/python3.12/subprocess.py", line 571, in run
    raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '['/private/var/tmp/_bazel_fp/01684213bb9c693047272ec8f0247481/external/rules_python++python+python_3_12_3_host/python', '-m', 'pip', '--isolated', 'wheel', '--no-deps', '-r', '/var/folders/kb/_rxq15gs3gq21765gfcbbdjw0000gn/T/tmpqdw169tm']' returned non-zero exit status 1.

It is seemingly processing a Linux only dep on my local Mac build.

We use "universal" lockfiles generated via uv (via rules_python) as such:

triton==3.4.0 ; python_full_version < '3.13' and platform_machine == 'x86_64' and sys_platform == 'linux' \
    --hash=sha256:00be2964616f4c619193cb0d1b29a99bd4b001d7dc333816073f92cf2a8ccdeb \
    --hash=sha256:31c1d84a5c0ec2c0f8e8a072d7fd150cab84a9c239eaddc6706c081bfae4eb04 \
    --hash=sha256:7936b18a3499ed62059414d7df563e6c163c5e16c3773678a3ee3d417865035d \
    --hash=sha256:7b70f5e6a41e52e48cfc087436c8a28c17ff98db369447bcaff3b887a3ab4467 \
    --hash=sha256:7ff2785de9bc02f500e085420273bb5cc9c9bb767584a4aa28d6e360cec70128 \
    --hash=sha256:98e5c1442eaeabae2e2452ae765801bd53cd4ce873cab0d1bdd59a32ab2d9397
    # via torch

Disabling PIPSTAR fixes my build.

FrankPortman avatar Oct 17 '25 23:10 FrankPortman

@FrankPortman, thanks for reporting this, when do you get this error? With bazel query?

aignas avatar Oct 20 '25 13:10 aignas

@FrankPortman, thanks for reporting this, when do you get this error? With bazel query?

bazel build //... triggers it

FrankPortman avatar Oct 20 '25 15:10 FrankPortman

Before flipping the switch back I would like to fully root cause this, but I don't have a Mac, so this may be a little difficult, could you please paste the wheel version you are using for torch?

I assume it is unmodified and rules_python is also not patched right? Are you using experimental_requirement_cycles by any chance?

aignas avatar Oct 21 '25 01:10 aignas

  • Here is the torch line from my lockfile: torch==2.8.0 ; python_full_version < '3.13' \
  • We do not patch torch
  • We do patch rules_python using something similar (but not quite exactly) to https://github.com/mbland/rules_python_pip_name/blob/main/rules_python_pypi_info.patch
  • We do use experimental_requirement_cycles and torch is part of one --> "torchdeps": ["torch", "torchvision"]

FrankPortman avatar Oct 21 '25 01:10 FrankPortman

Removed usage of experimental_requirement_cycles and that didn't fix it. Will take me a little longer to test removing the patches.

FrankPortman avatar Oct 22 '25 18:10 FrankPortman

The patch that you have should not affect this (at least the one you posted above), because it is all within a rule context and not in a macro.

aignas avatar Oct 23 '25 02:10 aignas

@FrankPortman, I need your help here, #3379 is my attempt to setup something that I could debug, however, it is not there yet. Can you please comment on which parts I should modify to make the example the same?

  • python_version
  • torch version (which index are you using?)

If you manage to get a failing case, could you please post:

  • BUILD.bazel file of torch package?

aignas avatar Oct 26 '25 04:10 aignas

python.toolchain(
    configure_coverage_tool = True,
    is_default = True,
    python_version = "3.12.3",
)

# uv
uv = use_extension(
    "@rules_python//python/uv:uv.bzl",
    "uv",
    dev_dependency = True,
)
uv.configure(
    version = "0.8.17",
)
torch==2.8.0

We are using the default index - not setting that arg in any way. I think you specifically want to NOT set the "cpu" index so that you can reproduce my universal lockfile which contains linux/GPU packages where appropriate.

I will pull that branch tomorrow and play around with it a little bit to see if I can get a failing case. Do you have any basic instructions on what I should run from that branch to play with the repro? Just cd into that directory and try bazel build //...?

FrankPortman avatar Oct 26 '25 21:10 FrankPortman

@FrankPortman Could you please run bazel build with RULES_PYTHON_ENABLE_PIPSTAR=1 RULES_PYTHON_REPO_DEBUG_VERBOSITY=TRACE

That way I think I should get more diagnostic output.

aignas avatar Nov 01 '25 04:11 aignas

@aignas https://gist.github.com/FrankPortman/2dee0891a64cc1ddca90b3fabe57532d

FrankPortman avatar Nov 02 '25 02:11 FrankPortman

@aignas does the above help? anything else I can do?

FrankPortman avatar Nov 11 '25 23:11 FrankPortman

Sorry, I have little bandwidth these days to dig into this.

I checked it now and it seems that there is no information about what is happening in the module evaluation phase, so if you removed the lock file and reran it, I might get more information. I still don't have enough data to understand why the dependency is pulled.

My plan is to also add more tests on our side when I have time to ensure that we are not doing anything silly.

And I think I need more logging statements in the macro whl_library_targets evaluation.

aignas avatar Nov 12 '25 01:11 aignas

I have tried debugging this in the CI in #3379 but the builds seem to be succeeding with the env variable enabled. Am I missing something here, could you please check?

aignas avatar Nov 22 '25 08:11 aignas

I'm still taking a look but the first things I can think of are:

  • torch==2.8.0
  • My python version is 3.12.3
  • My lock call looked quite different from yours, e.g.
lock(
    name = "_update_python_deps",
    srcs = [
        ":3rdparty_python_requirements.in",
        ":pyproject.toml",
    ],
    out = "3rdparty_python_requirements.txt",
    args = ["--universal"],
)

where pyproject.toml included Python version.

But when I comment that out and add the python version to args and also specifiy python_version=3.12, it still doesn't work.

I can investigate more later but hopefully this helps narrow it down. I was also seeing something weird about @my_hub//..../pyyaml_ft missing a BUILD file during some debugging step and the only thing I can think of with that one is it is implicated in a python_version constraint. But maybe let's ignore that error for now.

FrankPortman avatar Nov 22 '25 14:11 FrankPortman

It is very hard to tell what's going on:

  1. torch==2.8.0 does not add any different dependencies.
  2. The same is even with the latest python 3.12. Is there any reason why you are still on 3.12.3 and not the latest 3.12?
  3. Same as above.

Could you please check if the example in rules_python repo works on your setup? It works in the CI. If our example works but your setup doesn't then the real problem may be something else entirely.

aignas avatar Nov 23 '25 00:11 aignas

I think I figured it out. It was failing at what seemed like the wheel fetching stage but if I removed this target:

genquery(
  name = foo,
  expression = "allpaths(bar, @pip//torch)",
  scope = ["bar", "@pip//torch"],
)

then my build //... succeeds with pipstar enabled. Basically it seems like genquery referencing torch causes it to think it needs triton, even on Mac, if pipstar is enabled. Is this intended behavior?

FrankPortman avatar Nov 23 '25 03:11 FrankPortman

Yes, because this is basically a bazel query command, it will try to fetch all of the dependencies that it has in the tree. If you had cquery instead then it would work as you expect it. Nevertheless, you probably don't want to have this long term, because if you register 2 platforms (linux and macos), then this command will force fetching of mac and linux wheels when the cross builds (#260) are enabled.

TBH, I am thinking that experimental_index_url usage would fix this, but I don't want to make this a requirement for our users onboarding onto pipstar.

I need to think about this if it is possible to fix this, maybe we'll have to enable pipstar and experimental_index_url at the same time.

aignas avatar Nov 23 '25 12:11 aignas

I'm not familiar with the experimental downloader you're talking about but if there is anything you'd like me to test, I'm happy to try. It's unfortunate that we would likely have to say goodbye to this genquery if we want to enable cross platform builds eventually since it's pretty useful for us, but I can (mostly) imagine a different way we'd accomplish the same thing with aspects if really needed.

FrankPortman avatar Nov 23 '25 13:11 FrankPortman

Ok can confirm that if I use the experimental_index_url, pointed to pypi, then my monorepo builds with pipstar enabled, and with the genquery still available. However I saw some test failures and generally have a bunch of questions on behavior/performance of experimental_index_url. Is engaging in #python channel of Slack the best place for those questions?

FrankPortman avatar Nov 23 '25 15:11 FrankPortman

Extra things that have been done as part of this feature where:

  • #3429
  • #3430

aignas avatar Nov 24 '25 06:11 aignas

@FrankPortman, created https://github.com/bazel-contrib/rules_python/pull/3441 to ensure that your genquery does not fail when enabling PIPSTAR. This should not require you to enable the experimental_index_url anymore.

aignas avatar Dec 06 '25 10:12 aignas