buck2 icon indicating copy to clipboard operation
buck2 copied to clipboard

Disable cell-segmentation of import paths with buckconfig

Open cormacrelf opened this issue 2 months ago • 6 comments

Fixes https://github.com/facebook/buck2/issues/683 if you enable the config.

Also fixes https://github.com/facebook/buck2/issues/890.

This is achieved by ignoring the build_file_cell during ImportPath construction. I can confirm this fixes the test case on that issue (https://github.com/cormacrelf/buck2-tset-bug).

This is a big improvement for users outside of Meta because previously it was practically impossible to use transitive sets. I have been writing some rules and have had to emulate the feature with regular starlark sets.

Previous version of this PR had cfg flags; now I made a buckconfig for it. Because it does break some things.

This will break any code that reads read_config / get_cell_name at the top level of a BZL file outside of the prelude. It will also break code that uses read_config / get_cell_name during analysis (which should never have worked). It is therefore not activated for anyone by default. However it is not super difficult to migrate. To test it, you can use

# root .buckconfig
[buck2]
    disable_cell_segmentation = true

Breakage

Every user, with or without the buckconfig

read_config during analysis is no longer accepted

Previously this was unofficially and perhaps accidentally supported:

def _impl(ctx: AnalysisContext) -> list[Provider]:
    xyz = read_config("abc", "xyz")
    ...

rule = rule(impl = _impl)

The reason being that read_config was marked as #[starlark(speculative_exec_safe)], so it was actually being inlined when the file was initially parsed. It can no longer be inlined so this no longer works. I recall this working sometimes but not always, to my great puzzlement, which is why I expect people won't have used it that much. There will nevertheless be code that does rely on it. It now fails with

error: This function is unavailable during analysis (usual solution is to place the information on a toolchain)
BUILD FAILED
Error running analysis for `root//rust:test (prelude//platforms:default#7171795d350c5b11)`

Caused by:
    Traceback (most recent call last):
      File <builtin>, in <module>
      * prelude/rust/rust_binary.bzl:519, in rust_test_impl
          re_executor, executor_overrides = get_re_executors_from_props(ctx)
      * prelude/tests/re_utils.bzl:68, in get_re_executors_from_props
          re_arg = _get_re_arg(ctx)
      * prelude/tests/re_utils.bzl:20, in _get_re_arg
          force_local = read_config("fbcode", "disable_re_tests", default = False)

    error: This function is unavailable during analysis (usual solution is to place the information on a toolchain)
      --> prelude/tests/re_utils.bzl:20:19
       |
    20 |     force_local = read_config("fbcode", "disable_re_tests", default = False)
       |                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
       |

To migrate code that relied on starlark inlining of read_config

Simply lift read_config calls to constants outside of analysis impls. I have already done this for a few stragglers in the prelude.

+_XYZ = read_config("abc", "xyz")
+
def _impl(ctx: AnalysisContext) -> list[Provider]:
-    xyz = read_config("abc", "xyz")
-    if xyz:
+    if XYZ:

abc_library = rule(impl = _impl)

Breaking changes WITH the buckconfig

Before (non-prelude cell)

# othercell//:defs.bzl

# all calls relate to the build file cell and will agree.
STATIC_CONFIG = read_config(...)
STATIC_CELL = native.get_cell_name()
abc = rule(
    ...
    attrs = { "xyz": attrs.string(default = read_config(...)), },
)
def macro():
    _ = read_config(...)
    _ = native.get_cell_name()
# BUCK file

load("@othercell//:defs.bzl", "macro")
macro()

After

There is now a difference between two kinds of calls to get_cell_name/read_config. It matters when you evaluate them.

# othercell//:defs.bzl

# These now relate to the .bzl file's cell
STATIC_CONFIG = read_config(...)
STATIC_CELL = native.get_cell_name()
abc = rule(
    ...
    attrs = { "xyz": attrs.string(default = read_config(...)), },
)

def macro():
    # these two relate to build file cell as before
    _ = read_config(...)
    _ = native.get_cell_name()

# BUCK file

load("@othercell//:defs.bzl", "macro")
macro()

For the prelude specifically

No change, the "after" is already how it works in the prelude.

How to migrate non-prelude cells

Before buck2.disable_cell_segmentation, all read_config/get_cell_name calls referred to the importing build file's cell. Achieving this is now a bit more difficult. You must now call read_config in a macro such that it is evaluated in the build file context. This may require wrapping a rule in a macro.

The ONLY way to achieve 'read the importing build file cell config' is to put it in a macro and call that macro during build file evaluation. Notably, attrs.string(default = read_config(...)) is now evaluated in the cell owning the .bzl file since that is eagerly evaluated and passed to rule().

def _impl(ctx: AnalysisContext) -> list[Provider]:
-    xyz = read_config("abc", "xyz")
-    if xyz:
+    if ctx.attrs.xyz:

-abc_library = rule(impl = _impl)
+abc_library = rule(impl = _impl, attrs = { "xyz": attrs.string() })
+_abc_library = abc_librneary
+
+def abc_library(**kwargs):
+    kwargs.setdefault("xyz", read_config("abc", "xyz"))
+    _abc_library(**kwargs)

I believe this pattern is already necessary to read package values and populate defaults with them.

Alternatively, use a toolchain rule and allow your users to configure things there instead of in buckconfig.

Review questions

  • ~~Is parser.cell_segmentation the best buckconfig name? Almost every other config is in the buck2 section but this is one of those rare occasions that maybe parser is right? Feel free to change it in app/buck2_interpreter/src/import_paths.rs. You can call the config whatever you like, just fix the tests, or I am happy to change it. I suppose it doesn't matter since it's a temporary thing anyway.~~ Ok I changed it to buck2.disable_cell_segmentation.
  • I don't know how to run the integration test that I wrote. Fingers crossed it works.
  • There may be a need for disabling this in a more fine-grained way. Especially if people are using a bunch of external cells and they have not been updated to support this new mode.
  • Do we maybe want to read the buckconfig not from the root, but rather from the bzl's cell? That way a cell can mark itself safe for single-eval. Which it will do if it contains transitive set definitions, for example. This would slow down a global migration away from cell segmentation by reducing incentive to change code to work without it.

cormacrelf avatar Sep 23 '25 06:09 cormacrelf

@facebook-github-bot has imported this pull request. If you are a Meta employee, you can view this in D83033385. (Because this pull request was imported automatically, there will not be any future comments.)

facebook-github-bot avatar Sep 23 '25 06:09 facebook-github-bot

Test buck2_interpreter/ package_imports::tests::test fixed, I think windows was failing for unrelated reasons.

cormacrelf avatar Sep 23 '25 08:09 cormacrelf

Also, if you want to turn this into a runtime config flag so you can test and migrate behaviour internally that's fine, feel free to do so. You can make a very similar change that simply overrides build_file_cell = path.cell() at ImportPath construction time, which works just as well and can then be made configurable.

cormacrelf avatar Sep 23 '25 08:09 cormacrelf

@facebook-github-bot has imported this pull request. If you are a Meta employee, you can view this in D83033385. (Because this pull request was imported automatically, there will not be any future comments.)

meta-codesync[bot] avatar Oct 24 '25 13:10 meta-codesync[bot]

Draft because basically this breaks some usages of get_cell_name being called in a .bzl file.

# one//:bfc.bzl

# returns one, only evaluated once
ACTIVE_CELL = native.get_cell_name()

def macro():
    # works
    return native.get_cell_name()

# two//BUCK
load("one//:bfc.bzl", "macro", "ACTIVE_CELL")

# if you import ACTIVE_CELL, you get "one".
# but if you call the macro, it returns "two"
# works
macro()

So we just have to fix some usages like ACTIVE_CELL to be like macro().

cormacrelf avatar Oct 27 '25 03:10 cormacrelf

Ok, this is better.

Before (non-prelude cell)

# all four calls relate to the build file cell and will all agree.
STATIC_CONFIG = read_config(...)
STATIC_CELL = native.get_cell_name()
def macro():
    _ = read_config(...)
    _ = native.get_cell_name()

After

# These two relate to the .bzl file's cell
STATIC_CONFIG = read_config(...)
STATIC_CELL = native.get_cell_name()

def macro():
    # these two relate to build file cell as before
    _ = read_config(...)
    _ = native.get_cell_name()

For the prelude specifically

No change, the "after" is already how it works in the prelude.

read_config during analysis is no longer accepted

Previously this was unofficially and perhaps accidentally supported:

def _impl(ctx: AnalysisContext) -> list[Provider]:
    xyz = read_config("abc", "xyz")
    ...

rule = rule(impl = _impl)

The reason being that read_config was #[starlark(speculative_exec_safe)], so it was actually being inlined when the entire file was parsed. It can no longer be inlined so this no longer works. It fails with

BUILD FAILED
Error running analysis for `root//rust:test (prelude//platforms:default#7171795d350c5b11)`

Caused by:
    Traceback (most recent call last):
      File <builtin>, in <module>
      * prelude/rust/rust_binary.bzl:519, in rust_test_impl
          re_executor, executor_overrides = get_re_executors_from_props(ctx)
      * prelude/tests/re_utils.bzl:68, in get_re_executors_from_props
          re_arg = _get_re_arg(ctx)
      * prelude/tests/re_utils.bzl:20, in _get_re_arg
          force_local = read_config("fbcode", "disable_re_tests", default = False)

    error: This function is unavailable during analysis (usual solution is to place the information on a toolchain)
      --> prelude/tests/re_utils.bzl:20:19
       |
    20 |     force_local = read_config("fbcode", "disable_re_tests", default = False)
       |                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
       |

cormacrelf avatar Oct 27 '25 09:10 cormacrelf