pants icon indicating copy to clipboard operation
pants copied to clipboard

`pants help-all` isn't deterministic

Open huonw opened this issue 1 year ago • 2 comments

Describe the bug

The output of pants help-all appears to not be deterministic, when run with different pantsd instances (it is deterministic if run with the same pantsd running).

Determinstism is good in general, but this specifically leads to spurious diffs in the documentation sync PRs, which check in the output of pants help-all to the https://github.com/pantsbuild/pantsbuild.org repo (via the sync_docs.yml workflow).

I think this may be (partly) related to the use of sets in the code below (and potentially other places in that file):

https://github.com/pantsbuild/pants/blob/8481389512cc1563ef08077838503769d8e4a4ae/src/python/pants/engine/target.py#L441-L452

These potentially lead to non-deterministic order in class_field_types in that file, which is what is called to fill out a target's field in TargetTypeHelpInfo in help_info_extractor.py.

Reproducer:

cd $(mktemp -d)

cat > pants.toml <<EOF
[GLOBAL]
pants_version = "2.19.0"
EOF

pants --no-pantsd help-all > first.json
pants --no-pantsd help-all > second.json

diff -U3 first.json second.json

Running that gives output like (truncated):

--- first.json	2024-02-27 12:28:23.000000000 +1100
+++ second.json	2024-02-27 12:28:26.000000000 +1100
@@ -13895,8 +13895,12 @@
       "used_in_rules": []
     },
     "pants.core.goals.lint.Batch": {
-      "consumed_by_rules": [],
-      "dependencies": [],
+      "consumed_by_rules": [
+        "pants.backend.project_info.regex_lint.lint_with_regex_patterns"
+      ],
+      "dependencies": [
+        "pants.core"
+      ],
       "dependents": [
         "pants.backend.project_info"
       ],
@@ -13904,15 +13908,11 @@
       "is_union": true,
       "module": "pants.core.goals.lint",
       "name": "Batch",
-      "provider": "pants.core",
+      "provider": "pants.backend.project_info",
       "returned_by_rules": [],
-      "union_members": [
-        "Batch"
-      ],
-      "union_type": null,
-      "used_in_rules": [
-        "pants.core.goals.lint.lint"
-      ]
+      "union_members": [],
+      "union_type": "Batch",
+      "used_in_rules": []
     },
     "pants.core.goals.lint.Lint": {
       "consumed_by_rules": [],
@@ -21164,25 +21164,25 @@
           "type_hint": "str | None"
         },
         {
-          "alias": "subprocess_environment_env_vars",
+          "alias": "pex_executable_search_paths",
           "default": null,
-          "description": "Overrides the default value from the option `[subprocess-environment].env_vars` when this environment target is active.",
+          "description": "Overrides the default value from the option `[pex].executable_search_paths` when this environment target is active.",
           "provider": "pants.core",
           "required": false,
           "type_hint": "Iterable[str] | None"
         },
         {
-          "alias": "python_native_code_cpp_flags",
+          "alias": "subprocess_environment_env_vars",
           "default": null,
-          "description": "Overrides the default value from the option `[python-native-code].cpp_flags` when this environment target is active.",
+          "description": "Overrides the default value from the option `[subprocess-environment].env_vars` when this environment target is active.",
           "provider": "pants.core",
           "required": false,
           "type_hint": "Iterable[str] | None"
         },
         {
-          "alias": "pex_executable_search_paths",
+          "alias": "system_binaries_system_binary_paths",
           "default": null,
-          "description": "Overrides the default value from the option `[pex].executable_search_paths` when this environment target is active.",
+          "description": "Overrides the default value from the option `[system-binaries].system_binary_paths` when this environment target is active.",
           "provider": "pants.core",
           "required": false,
           "type_hint": "Iterable[str] | None"
@@ -21196,33 +21196,33 @@
           "type_hint": "Iterable[str] | None"
         },
         {
-          "alias": "test_extra_env_vars",
+          "alias": "python_native_code_ld_flags",
           "default": null,
-          "description": "Overrides the default value from the option `[test].extra_env_vars` when this environment target is active.",
+          "description": "Overrides the default value from the option `[python-native-code].ld_flags` when this environment target is active.",
           "provider": "pants.core",
           "required": false,
           "type_hint": "Iterable[str] | None"
         },
         {
-          "alias": "python_native_code_ld_flags",
+          "alias": "python_bootstrap_search_path",
           "default": null,
-          "description": "Overrides the default value from the option `[python-native-code].ld_flags` when this environment target is active.",
+          "description": "Overrides the default value from the option `[python-bootstrap].search_path` when this environment target is active.",
           "provider": "pants.core",
           "required": false,
           "type_hint": "Iterable[str] | None"
         },
         {
-          "alias": "python_bootstrap_search_path",
+          "alias": "python_native_code_cpp_flags",
           "default": null,
-          "description": "Overrides the default value from the option `[python-bootstrap].search_path` when this environment target is active.",
+          "description": "Overrides the default value from the option `[python-native-code].cpp_flags` when this environment target is active.",
           "provider": "pants.core",
           "required": false,
           "type_hint": "Iterable[str] | None"
         },
         {
-          "alias": "system_binaries_system_binary_paths",
+          "alias": "test_extra_env_vars",
           "default": null,
-          "description": "Overrides the default value from the option `[system-binaries].system_binary_paths` when this environment target is active.",
+          "description": "Overrides the default value from the option `[test].extra_env_vars` when this environment target is active.",
           "provider": "pants.core",
           "required": false,
           "type_hint": "Iterable[str] | None"

Pants version

2.19.0, 2.20.0a0

OS macOS

Additional info

N/A

huonw avatar Feb 27 '24 01:02 huonw

yep, that returned tuple should be sorted.

kaos avatar Feb 27 '24 09:02 kaos

This is mostly fixed by the sort, but there are some remaining. All the remaining diffs look to be in "name_to_api_type_info" with types that are provided in multiple places. I can find 2 categories:

  1. distinct_union_type_per_subclass, like "pants.core.goals.test.Batch". This is caused bydistinct_union_type_per_subclass not changing the generated class's __name__, but changing the __qualname__. But the help printer computes (something like) the qualname with f"{api_type.__module__}.{api_type.__name__}", which isn't the same, since we fake the __qualname__ but don't change the module. I think we're torn between providing a pointer to a class that exists at a location where it doesn't or vice-versa. I think using __qualname__ here makes more sense, giving us all the types with correct info but requiring people to traverse the union.
  2. classes that are provided through 2 different paths, for example BlackRequest.Batch is provided both by "pants.backend.build_files.fmt.black" and "pants.backend.python.lint.black". I'm not sure that there's a good way around this. We could enforce uniqueness by using fully-qualified identifiers for the keys. Or we could enforce an insertion order so the shadowing is deterministic, although the shadowing seems like a problem.

lilatomic avatar Apr 07 '24 21:04 lilatomic