bazel icon indicating copy to clipboard operation
bazel copied to clipboard

"compiler_files" within "CcToolchainInfo" is always empty when accessed in Starlark

Open rdeushane opened this issue 1 year ago • 3 comments
trafficstars

Description of the bug:

When creating a hermetic toolchain with cc_toolchain(), we provide an exact list of files required for compilation through compiler_files. This list of files is sandboxed as expected when using native C/C++ rules, or when performing compilation through cc_common.compile.

I would expect that after resolving the toolchain in Starlark and calling CcToolchainInfo.compiler_files, I would receive that same list of files, but this is not the case. The returned list of files is always empty, []. As a result, we have to use CcToolchainInfo.all_files, which is a significantly larger set of files than we require.

The impact of this is that all actions registered through ctx.actions.run which would otherwise require only compile_files, have to use the entirety of all_files, causing significant bloat to the inputs of these actions.

Which category does this issue belong to?

Starlark Integration

What's the simplest, easiest way to reproduce this bug? Please provide a minimal example if possible.

No response

Which operating system are you running Bazel on?

Linux

What is the output of bazel info release?

$ bazel info release INFO: Invocation ID: d38d4c0b-c82e-47b0-b085-3da34eba379d release 6.4.0- (@non-git)

If bazel info release returns development version or (@non-git), tell us how you built Bazel.

Our organization has made minor tweaks to facilitate coverage generation in our repository. I would not expect these changes to influence this issue. https://github.com/SibrosTech/bazel

What's the output of git remote get-url origin; git rev-parse HEAD ?

No response

Is this a regression? If yes, please try to identify the Bazel commit where the bug was introduced.

No response

Have you found anything relevant by searching the web?

No response

Any other information, logs, or outputs that you want to share?

No response

rdeushane avatar May 20 '24 19:05 rdeushane

I don't see any evidence or trace in the code base that would make CcToolchainInfo.compiler_files empty, either at head or in version 6.4.0.

Please provide a reproducer for one of the unmodified Bazel releases.

comius avatar May 23 '24 10:05 comius

@comius Please check out my reproduction repo here. This is written to use 7.0.0.

The output of:

    cc_toolchain = find_cpp_toolchain(ctx)
    print("Compiler: " + cc_toolchain.compiler)
    print("CcToolchainInfo methods: \n" + " ".join(dir(cc_toolchain)))
    print("cc_toolchain.compiler_files: {}".format(cc_toolchain.compiler_files))
    print("cc_toolchain.compiler_files methods: {}".format(dir(cc_toolchain.compiler_files)))

is:

DEBUG: /home/sibrosian/bazel_22454_repro/hello_world/custom_rule.bzl:9:10: Compiler: clang
DEBUG: /home/sibrosian/bazel_22454_repro/hello_world/custom_rule.bzl:10:10: CcToolchainInfo methods: 
all_files ar_executable ar_files as_files build_info_files built_in_include_directories compiler compiler_executable compiler_files coverage_files cpu dwp_files dynamic_runtime_lib dynamic_runtime_solib_dir fdo_context gcov_executable get_abi get_abi_glibc_version get_additional_make_variables get_all_files_including_libc get_artifact_name_for_category get_build_variables get_crosstool_top_path ld_executable legacy_cc_flags_make_variable libc linker_files needs_pic_for_dynamic_libraries nm_executable objcopy_executable objcopy_files objdump_executable preprocessor_executable runtime_sysroot solib_dir static_runtime_lib strip_executable strip_files sysroot target_gnu_system_name to_json to_proto tool_path toolchain_id
DEBUG: /home/sibrosian/bazel_22454_repro/hello_world/custom_rule.bzl:11:10: cc_toolchain.compiler_files: <built-in method compiler_files of CcToolchainInfo value>
DEBUG: /home/sibrosian/bazel_22454_repro/hello_world/custom_rule.bzl:12:10: cc_toolchain.compiler_files methods: []

What's also interesting is that the compiler files field that I'm trying to access doesn't seem to be documented here. Is there not a sanctioned way to access only compiler files for custom actions?

rdeushane avatar May 29 '24 18:05 rdeushane

A mistake I made here, compiler_files, and any of the other tool specific file methods aside from all_files are intentionally hidden as private:

ERROR: /home/sibrosian/bazel_22454_repro/hello_world/BUILD:3:15: in cc_custom_rule rule //hello_world:hello_world: 
Traceback (most recent call last):
        File "/home/sibrosian/bazel_22454_repro/hello_world/custom_rule.bzl", line 12, column 87, in _cc_custom_rule_impl
                print("cc_toolchain.compiler_files methods: {}".format(cc_toolchain.compiler_files()))
Error in compiler_files: file '//hello_world:custom_rule.bzl' cannot use private API

We have custom aspects that run C/C++ lint actions on our large codebase. The linters only require access to compiler_files, but we instead have to use all_files as the action inputs, which we believe is causing extreme ram consumption and slowdowns when run at scale.

rdeushane avatar May 29 '24 20:05 rdeushane

Since it seems you have a modified version of bazel, I can suggest that you remove the protection of the field. Bazel at head exposes ._compiler_files without a protection.

In both cases, there's a warning that we will change the contents of that field in the future, from list of files to FilesToRun to better support runfiles. That's the risk, but I'm guessing it should be easy to fix the code.

comius avatar May 31 '24 14:05 comius

@comius I'll give it a shot, glad to hear it's exposed at head now. Thanks!

rdeushane avatar May 31 '24 17:05 rdeushane