rules_python icon indicating copy to clipboard operation
rules_python copied to clipboard

Added arg for free threading support

Open vfdev-5 opened this issue 1 year ago • 9 comments

Description:

  • Added free_threading argument to point to correct paths of the headers and the library.

Context:

  • Related to https://github.com/google/jax/issues/23073 and building jaxlib with python 3.13 interpreter built with disabled GIL.

cc @vam-google

vfdev-5 avatar Aug 19 '24 11:08 vfdev-5

@aignas I tried to implement flag_values in toolchaines. Here is the how it is invoked from XLA now:

python_register_toolchains(
    name = "python",
    # By default assume the interpreter is on the local file system, replace
    # with proper URL if it is not the case.
    base_url = "file://",
    ignore_root_user_error = True,
    python_version = "3.13.0",
    tool_versions = {
        "3.13.0": {
            # Path to .tar.gz with Python binary. By default it points to .tgz
            # file in cache where it was built originally; replace with proper
            # file location, if you moved it somewhere else.
            "url": "/root/.cache/bazel/_bazel_root/83d0ab0b1a11dcb5adba61b76e88291b/external/python_dev/python_dev-3.13.0.tgz",
            "sha256": {
                # By default we assume Linux x86_64 architecture, eplace with
                # proper architecture if you were building on a different platform.
                "x86_64-unknown-linux-gnu": "01d224baa3ff5ddb3fc1489d8e07cbbaf17cc705652dfe0369663a658db26d49",
            },
            "strip_prefix": "python_dev-3.13.0",
            # Added when registering the toolchains
            "flag_values": {
                "@rules_python//python/config_settings:free_threading": "yes",
            },
            # Added as the suffix for python versions
            "suffix": "_ft",
        },
    },
)

Can you please review and leave a feedback whether it corresponds to your idea? I have also few questions here: https://github.com/bazelbuild/rules_python/pull/2129#discussion_r1784760822 if you can answer it would be helpful! Thanks

vfdev-5 avatar Oct 03 '24 10:10 vfdev-5

@aignas I managed to pass flag_values and suffix from python.single_version_override into python_register_toolchains:

python = use_extension("@rules_python//python/extensions:python.bzl", "python")
python.toolchain(
    configure_coverage_tool = True,
    # Only set when you have multiple toolchain versions.
    is_default = True,
    python_version = "3.13",
    ignore_root_user_error=True,
)
python.single_version_override(
    python_version = "3.13.0",
    sha256 = {
        "x86_64-unknown-linux-gnu": "a4b4e0a79b53938db050daf7f78dd09faf49d1e29dbba6cac87cf857c0a51fb4",
    },
    urls = ["20241002/cpython-{python_version}rc3+20241002-{platform}-{build}.tar.gz"],
    # Added when registering the toolchains
    flag_values = {
        "@rules_python//python/config_settings:free_threading": "yes",
    },
    # Added as the suffix for python versions
    suffix = "_ft",
)

However, there is an issue with the label processing in this code:

        free_threading_label = str(Label("//python/config_settings:free_threading"))
        free_threading = False
        if flag_values:
            free_threading = flag_values.get(free_threading_label, False) == "yes"

Using WORKSPACE, the value of free_threading_label is "@rules_python//python/config_settings:free_threading" but using bzlmod and python.single_version_override, the value of free_threading_label is "@@rules_python~override//python/config_settings:free_threading". Do you have a suggestion on how to handle properly all cases?

vfdev-5 avatar Oct 07 '24 12:10 vfdev-5

Regarding the flag value handling, you could

  1. Normalize the dictionary so that you are sure that the keys are strings.
  2. Set the default of free_threading value to "no" so that in bzlmod when the user selects free threading to yes, only the special versions are matched.

I think that could work.

On 7 October 2024 21:54:22 GMT+09:00, vfdev @.***> wrote:

@aignas I managed to pass flag_values and suffix from python.single_version_override into python_register_toolchains:

python = ***@***.***_python//python/extensions:python.bzl", "python")
python.toolchain(
   configure_coverage_tool = True,
   # Only set when you have multiple toolchain versions.
   is_default = True,
   python_version = "3.13",
   ignore_root_user_error=True,
)
python.single_version_override(
   python_version = "3.13.0",
   sha256 = {
       "x86_64-unknown-linux-gnu": "a4b4e0a79b53938db050daf7f78dd09faf49d1e29dbba6cac87cf857c0a51fb4",
   },
   urls = ["20241002/cpython-{python_version}rc3+20241002-{platform}-{build}.tar.gz"],
   # Added when registering the toolchains
   flag_values = {
       ***@***.***_python//python/config_settings:free_threading": "yes",
   },
   # Added as the suffix for python versions
   suffix = "_ft",
)

However, there is an issue with the label processing in this code:

       free_threading_label = str(Label("//python/config_settings:free_threading"))
       free_threading = False
       if flag_values:
           free_threading = flag_values.get(free_threading_label, False) == "yes"

Using WORKSPACE, the value of free_threading_label is @.***_python//python/config_settings:free_threading"but using bzlmod andpython.single_version_override, the value of free_threading_labelis"@@rules_python~override//python/config_settings:free_threading"`. Do you have a suggestion on how to handle properly all cases?

-- Reply to this email directly or view it on GitHub: https://github.com/bazelbuild/rules_python/pull/2129#issuecomment-2396850260 You are receiving this because you were mentioned.

Message ID: @.***>

aignas avatar Oct 07 '24 13:10 aignas

  1. Normalize the dictionary so that you are sure that the keys are strings.

I tried to apply that here: https://github.com/bazelbuild/rules_python/pull/2129/files#diff-588f665d3e43008635e703e7601e5c5e295979b1a2f46566c6d8b12cee45adb3R362

  1. Set the default of free_threading value to "no" so that in bzlmod when the user selects free threading to yes, only the special versions are matched.

I'm not sure how to do that exactly. I was thinking to add something like this

# python_register_toolchains.bzl

def python_register_toolchains(
        name,
        python_version,
        register_toolchains = True,
        register_coverage_tool = False,
        set_python_version_constraint = False,
        tool_versions = None,
        minor_mapping = None,
        **kwargs):

    ...
    if flag_values:
         free_threading = flag_values.get(free_threading_label, False) == "yes"

    if free_threading and python_version < "3.13.0":
        fail("We can't specify free_threading: yes with python < 3.13.0")

but I do not know how to properly compare python versions in this language.

Can you also suggest where exactly to hook suffix in python_register_toolchains ?

vfdev-5 avatar Oct 07 '24 13:10 vfdev-5

Thanks for the suggestions @aignas ! Concerning the point 1, free-threading option is an option of CPython build, like we have build set as install_only (for free-threading releases I think we can use build version freethreaded+pgo-full). We also will have other sha256 values. I think adding free-threading option to PLATFORMS may not be the best way...

Concerning the point 3

If we used PLATFORMS for all of the flag_value passing around, the code changes for python_register_toolchains would be minimal. The coverage may need extra wiring, but that is about it.

We would like to have platform name reflecting flag value ? Let's say we have x86_64-unknown-linux-gnu platform name and flag values for free-threading yes/no: x86_64-unknown-linux-gnu and x86_64-unknown-linux-gnu-freethreading ?

Adding and handling keys like ("3.13.0", True) or maybe a "3.13.0-free-threading" to TOOL_VERSIONS seems to be a bit more appropriate. What do you think if we would just have a special handling for python versions like "3.XY.Z-free-threading" ?

Thinking out loud, it seems like our problem could be solved if we could specify build option even for other versions of python e.g. they have options like pgo-full, pgo+lto-full etc

vfdev-5 avatar Oct 09 '24 14:10 vfdev-5

Regarding platform names, yes we would have to add free-threading suffix to the platform name.

Regarding the version, Hmm... Yeah, adding something special at the end of the version crossed my mind as well - we could follow the upstream and use 't' as the suffix in that case. There are other suffixes (e.g. m) that have been historically used in the ABI definition in the past. However, some code expects the version to be semver compatible. That is why I would be in favour of using separate platforms.

The url in the TOOL_VERSIONS can be a dict by platform name. So that would fit nicely in the existing workspace model. Do you see any other difficulties in using platforms for everything?

On 9 October 2024 23:59:07 GMT+09:00, vfdev @.***> wrote:

Thanks for the suggestions @aignas ! Concerning the point 1, free-threading option is an option of CPython build, like we have build set as install_only (for free-threading releases I think we can use build version freethreaded+pgo-full). We also will have other sha256 values. I think adding free-threading option to PLATFORMS may not be the best way...

Concerning the point 3

If we used PLATFORMS for all of the flag_value passing around, the code changes for python_register_toolchains would be minimal. The coverage may need extra wiring, but that is about it.

We would like to have platform name reflecting flag value ? Let's say we have x86_64-unknown-linux-gnu platform name and flag values for free-threading yes/no: x86_64-unknown-linux-gnu and x86_64-unknown-linux-gnu-freethreading ?

Adding and handling keys like ("3.13.0", True) or maybe a "3.13.0-free-threading" to TOOL_VERSIONS seems to be a bit more appropriate. What do you think if we would just have a special handling for python versions like "3.XY.Z-free-threading" ?

-- Reply to this email directly or view it on GitHub: https://github.com/bazelbuild/rules_python/pull/2129#issuecomment-2402583920 You are receiving this because you were mentioned.

Message ID: @.***>

aignas avatar Oct 09 '24 23:10 aignas

Thanks for the details, @aignas !

I think there are few points to discuss:

  1. how to extend python_register_toolchains to be able to have tool_versions overridden such that we could specify free-threading build option.
  2. how user could select python with free-threading without overriding tool_versions.

Concerning the point 1: I would like to explore whether we can add build_option arg into tool_versions variable:

python_register_toolchains(
    name = "python",
    # By default assume the interpreter is on the local file system, replace
    # with proper URL if it is not the case.
    ignore_root_user_error = True,
    python_version = "3.13",
    tool_versions = {
        "3.13.0": {
            # NOTE THAT HERE WE HAVE .zst ARCHIVE AND {build} VALUE SHOULD BE freethreaded+pgo-full FOR EXAMPLE TO GET no-gil RELEASE
            "url": "20241008/cpython-{python_version}+20241008-{platform}-{build}.tar.zst",
            "sha256": {
                "x86_64-unknown-linux-gnu": "f36f79adfcbdbe13261ba364da4f06c6f3a081702fd4f155b63fe924ff1ee9a2",
            },
            "build_option": "freethreaded+pgo-full"    # this should be a flag ideally
        },
    },
)

this way. we can correctly fill {build} value in the url when rendering it:

# python_register_toolchains
        build_option = tool_versions[python_version].get("build_option", None)
        ...
        (release_filename, urls, strip_prefix, patches, patch_strip, free_threading) = get_release_info(
            platform, python_version, base_url, tool_versions, build_option=build_option
        )
        ...
        build_opt_str = ("_" + build_option.replace("+", "-")) if build_option else ""
        ...
        python_repository(
            name = "{name}_{platform}{build_opt}".format(
                name = name,
                platform = platform,
                build_opt = build_opt_str
            ),
        ...

and

def get_release_info(
    platform, python_version, base_url = DEFAULT_RELEASE_BASE_URL, tool_versions = TOOL_VERSIONS, build_option = None
):  
    ...
    url = tool_versions[python_version]["url"]
    if not build_option:
        build_option = "shared-install_only" if (WINDOWS_NAME in platform) else "install_only"
        free_threading = False
    else:
        # As build_option should be a flag we would check whether it is free_threading differently
        free_threading = True if build_option.startswith("freethreaded") else False
   ...
    for u in url:
        release_filename = u.format(
            platform = platform,
            python_version = python_version,
            build = build_option,
        )
        ...
    ...
    return (release_filename, rendered_urls, strip_prefix, patches, patch_strip, free_threading)

However, there is something hard-coded inside _pip_repository_impl implementation searching for the repository exactly as @python_x86_64-unknown-linux-gnu...

Here is a draft implementation: https://github.com/bazelbuild/rules_python/pull/2129/commits/be242fa34992788cded66198bbe6230ae15516de , the issue with pip searching @python_x86_64-unknown-linux-gnu is fixed now. What do you think ?

Concerning the point 2 "how user could select python with free-threading without overriding tool_versions.". We can maybe do something like this:

MINOR_MAPPING = {
    ...
    "3.13": "3.13.0",
    "3.13t": ("3.13.0", "freethreaded+pgo-full"),   (version, build_option)
}

TOOL_VERSIONS = {
    ...
    "3.13.0": {
        "url": "20241008/cpython-{python_version}+20241008-{platform}-{build}.tar.gz",
        "sha256": {
            "aarch64-apple-darwin": "5d3cb8d7ca4cfbbe7ae1f118f26be112ee417d982fab8c6d85cfd8ccccf70718",
            "aarch64-unknown-linux-gnu": "c1142af8f2c85923d2ba8201a35b913bb903a5d15f052c38bbecf2f49e2342dc",
            "ppc64le-unknown-linux-gnu": "1be64a330499fed4e1f864b97eef5445b0e4abc0559ae45df3108981800cf998",
            "s390x-unknown-linux-gnu": "c0b1cc51426feadaa932fdd9afd9a9af789916e128e48ac8909f9a269bbbd749",
            "x86_64-apple-darwin": "b58ca12d9ae14bbd79f9e5cf4b748211ff1953e59abeac63b0f4e8e49845669f",
            "x86_64-pc-windows-msvc": "c7651a7a575104f47c808902b020168057f3ad80f277e54cecfaf79a9ff50e22",
            "x86_64-unknown-linux-gnu": "455200e1a202e9d9ef4b630c04af701c0a91dcaa6462022efc76893fc762ec95",
        },
        "strip_prefix": "python",
   },
   ("3.13.0", "freethreaded+pgo-full"): {
       "url": "20241008/cpython-{python_version}+20241008-{platform}-{build}.tar.zst",
       "sha256": {
                "x86_64-unknown-linux-gnu": "f36f79adfcbdbe13261ba364da4f06c6f3a081702fd4f155b63fe924ff1ee9a2",
                ...
        },
        "strip_prefix": "python",
   }

Maybe, we can skip this feature for now and just enable point 1.

The url in the TOOL_VERSIONS can be a dict by platform name. So that would fit nicely in the existing workspace model. Do you see any other difficulties in using platforms for everything?

Adding cpython build option to the platform name seems a bit like a hack to me which may work. I do not know whether we'll have a similar issue with _pip_repository_impl which would look for exactly @python_x86_64-unknown-linux-gnu. I saw some code in python/private/pypi doing certain platform speculations, so appending build option to platform may break certain assumption there ?

vfdev-5 avatar Oct 10 '24 09:10 vfdev-5

FYI, I did a few experiments in https://github.com/aignas/rules_python/tree/exp/freethreading and I think that going with defining extra PLATFORM values will make it the easiest. the things that are remaining:

  • PyPI integration needs to handle cp313t wheels.
  • We should have tests that the config settings work as intended.

I don't have time to finish it, so if you are interested feel free to cherry pick code from my branch as you wish.

aignas avatar Oct 14 '24 02:10 aignas

FYI, I did a few experiments in https://github.com/aignas/rules_python/tree/exp/freethreading and I think that going with defining extra PLATFORM values will make it the easiest. the things that are remaining:

* PyPI integration needs to handle cp313t wheels.

* We should have tests that the config settings work as intended.

I don't have time to finish it, so if you are interested feel free to cherry pick code from my branch as you wish.

I'm trying to build xla with python 3.13 freethreading (it is my local test) using your branch and specifying:

python_register_toolchains(
    name = "python",
    ignore_root_user_error = True,
    python_version = "3.13",
    tool_versions = {
        "3.13.0": {
            "url": "20241008/cpython-{python_version}+20241008-x86_64-unknown-linux-gnu-freethreaded+pgo+lto-full.tar.zst",
            "sha256": {
                "x86_64-unknown-linux-gnu-freethreaded": "00a159a64640ce614bdac064b270a9854d86d40d1d8387a822daf1fe0881e64b"
            },
        },
    },
)

and it complains about missing definition:

ERROR: Failed to load Starlark extension '@pypi//:requirements.bzl'.
Cycle in the workspace file detected. This indicates that a repository is used prior to being defined.
The following chain of repository dependencies lead to the missing definition.
 - @pypi
 - @python_x86_64-unknown-linux-gnu
This could either mean you have to add the '@python_x86_64-unknown-linux-gnu' repository with a statement like `http_archive` in your WORKSPACE file (note that transitive dependencies are not added automatically), or move an existing definition earlier in your WORKSPACE file.
INFO: Repository pypi instantiated at:
  /tmp/jax/xla/WORKSPACE:61:16: in <toplevel>
  /tmp/jax/xla/third_party/py/python_init_pip.bzl:29:14: in python_init_pip
Repository rule pip_repository defined at:
  /root/.cache/bazel/_bazel_root/83d0ab0b1a11dcb5adba61b76e88291b/external/rules_python/python/private/pypi/pip_repository.bzl:222:33: in <toplevel>
ERROR: Error computing the main repository mapping: cycles detected during computation of main repo mapping

I had that previously when integrated build_option parameter. I can try to figure out if we can make it work but I feel that there are places where platform value is got using some heuristics and which will lack the suffix we are adding...

On the other hand, can you check this commit: https://github.com/bazelbuild/rules_python/commit/be242fa34992788cded66198bbe6230ae15516de with build_option. Using this code, I can build xla for cpython 3.13 freethreading

vfdev-5 avatar Oct 14 '24 09:10 vfdev-5

@aignas can you check above message please ?

vfdev-5 avatar Oct 22 '24 08:10 vfdev-5

I finally had some time to look into this and finished my previous POC, could you please test https://github.com/bazelbuild/rules_python/pull/2372?

aignas avatar Nov 05 '24 00:11 aignas