Bazel 7 unable to finalize action due to missing digest for `.d` files when `--experimental_inmemory_dotd_files` is set.
Description of the bug:
When using `--experimental_inmemory_dotd_files`, which seems to be the default, at least in Bazel 7, the `.d` file actions fail with a missing digest error.
ERROR: Foo/BUILD.bazel:11:15: Compiling Foo.c failed: unable to finalize action: Missing digest: <number>/<number> for bazel-out/ios_arm64-opt-ios-arm64-min12.0-applebin_ios-ST-<sha>/bin/path/to/Foo.d
Which category does this issue belong to?
No response
What's the simplest, easiest way to reproduce this bug? Please provide a minimal example if possible.
I haven't found a way to consistently reproduce this locally. Our CI machines are configured to:
- Not use a disk cache
- Not use remote execution
- Use a remote cache
They failed several times in our Bazel 7 testing. After setting `--noexperimental_inmemory_dotd_files`, we no longer saw this issue.
Which operating system are you running Bazel on?
macOS
What is the output of `bazel info release`?
release 7.1.1
If `bazel info release` returns `development version` or `(@non-git)`, tell us how you built Bazel.
No response
What's the output of `git remote get-url origin; git rev-parse HEAD`?
No response
Is this a regression? If yes, please try to identify the Bazel commit where the bug was introduced.
Yes, we've never seen this issue in Bazel 6 and not much has changed in our Bazel 7 testing in terms of flags.
Have you found anything relevant by searching the web?
- https://github.com/bazelbuild/bazel/issues/20123 is potentially related but not exactly the same
Any other information, logs, or outputs that you want to share?
No response
We looked for something that could go wrong with the combination of `--remote_cache`, `--remote_download_all` and `--experimental_inmemory_dotd_files`, but don't have a plausible theory yet (other than the remote cache spuriously evicting blobs, but that doesn't explain why it only happens with `.d` files, and only when in-memory outputs are enabled).
Can you provide the following information:
- The complete list of Bazel flags you're using
- The remote cache implementation you're using
- The `--experimental_remote_grpc_log` for one of the failed invocations (feel free to scrub sensitive data, but please preserve the digests, or rewrite them in such a way that they match up between gRPC requests)
In addition, it would be helpful to know the following:
- Can you repro this against a disk cache, or a different remote cache implementation? (e.g. a simple HTTP cache that is guaranteed to never evict any blobs on its own)
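One way to scrub a log while keeping digests matched up between requests is to replace each distinct hash with a stable placeholder. The sketch below is only an illustration: it assumes the log has already been converted to text and that hashes appear as 64-character lowercase hex strings. `scrub_digests` and the `digest-N` placeholder format are hypothetical names, not part of Bazel.

```python
import re

# Assumption: hashes are 64 lowercase hex characters (e.g. SHA-256 or BLAKE3 output).
HASH_RE = re.compile(r"\b[0-9a-f]{64}\b")


def scrub_digests(text, mapping=None):
    """Replace each distinct hash with a stable placeholder.

    The same hash always maps to the same placeholder, so references
    still line up between requests after scrubbing.
    """
    if mapping is None:
        mapping = {}

    def replace(match):
        original = match.group(0)
        if original not in mapping:
            # Number placeholders in order of first appearance.
            mapping[original] = f"digest-{len(mapping) + 1:04d}"
        return mapping[original]

    return HASH_RE.sub(replace, text), mapping
```

Passing the returned `mapping` back in when scrubbing a second file keeps the placeholders consistent across files.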
Thanks for investigating @tjgq
I can provide the first two now and look at the execution log when I get a chance:
- The `--announce_rc` logs for our flags in CI:
INFO: Invocation ID: <ID>
INFO: Reading 'startup' options from /Users/build/.jenkins/workspace/cash-ios/ios-builder/s/c/.bazelrc: --host_jvm_args=-Djavax.net.ssl.trustStore=Configuration/Java.cacerts, --host_jvm_args=-Djavax.net.ssl.trustStorePassword=changeit, --host_jvm_args=-DBAZEL_TRACK_SOURCE_DIRECTORIES=1, --max_idle_secs=86400, --digest_function=blake3
INFO: Options provided by the client:
Inherited 'common' options: --isatty=0 --terminal_columns=80
INFO: Reading rc options for 'build' from /Users/build/.jenkins/workspace/cash-ios/ios-builder/s/c/.bazelrc:
Inherited 'common' options: --remote_header=hashfn=blake3 --lockfile_mode=update --incompatible_disallow_empty_glob --experimental_repository_downloader_retries=3 --incompatible_strict_action_env=true --spawn_strategy=local --verbose_failures --test_output=errors --max_config_changes_to_show=-1 --attempt_to_print_relative_paths --experimental_inprocess_symlink_creation --keep_going --output_filter=^//.*:((?!(SwiftLintCore|SwiftLintBuiltInRules).*).)*$ --noexperimental_inmemory_dotd_files --compilation_mode=dbg --@build_bazel_rules_swift//swift:copt=-whole-module-optimization --@build_bazel_rules_swift//swift:exec_copt=-whole-module-optimization --@rules_xcodeproj//xcodeproj:extra_common_flags=--//Bazel:is_building_in_xcode=0 --features=swift.emit_symbol_graph_extension_blocks --action_env=CACHE_EPOCH=4 --remote_download_outputs=all --config=cache_cdn_read --noremote_upload_local_results --remote_local_fallback --experimental_remote_merkle_tree_cache --experimental_guard_against_concurrent_changes --disk_cache=~/Library/Caches/bazel-cash-ios-cache --remote_build_event_upload=minimal --nolegacy_important_outputs --modify_execution_info=^(AppleLipo|BitcodeSymbolsCopy|BundleApp|BundleTreeApp|DsymDwarf|DsymLipo|GenerateAppleSymbolsFile|ObjcBinarySymbolStrip|CppArchive|CppLink|ObjcLink|ProcessAndSign|SignBinary|SwiftArchive|SwiftStdlibCopy|PackagingFramework.+|ExtendModulemap|HmapCreate)$=+no-remote,^(BundleResources|ImportedDynamicFrameworkProcessor)$=+no-remote-exec --remote_cache_compression=true --xcode_version_config=//Bazel:host_xcodes --macos_minimum_os=13.0 --host_macos_minimum_os=13.0 --config virtual_frameworks --features=-swift.vfsoverlay --@build_bazel_rules_apple//apple/build_settings:use_tree_artifacts_outputs=true --define=apple.incompatible.objc_framework_propagate_modulemap=true
INFO: Reading rc options for 'build' from /Users/build/.jenkins/workspace/cash-ios/ios-builder/s/c/.bazelrc:
'build' options: --flag_alias=build_config=//Bazel:build_config --flag_alias=release_variant=//Bazel:release_variant --flag_alias=xcscheme=//Bazel/apple/xcschemes:xcscheme
INFO: Found applicable config definition common:cache_cdn_read in file /Users/build/.jenkins/workspace/cash-ios/ios-builder/s/c/.bazelrc: --remote_cache=<REDACTED>
INFO: Found applicable config definition common:virtual_frameworks in file /Users/build/.jenkins/workspace/cash-ios/ios-builder/s/c/.bazelrc: --features apple.virtualize_frameworks
INFO: Found applicable config definition common:ci in file /Users/build/.jenkins/workspace/cash-ios/ios-builder/s/c/.bazelrc: --remote_upload_local_results --build_metadata=ROLE=CI --announce_rc --color=no --curses=no --noshow_loading_progress --show_progress_rate_limit=15.0 --progress_report_interval=60 --disk_cache=
INFO: Found applicable config definition common:cache_grpc in file /Users/build/.jenkins/workspace/cash-ios/ios-builder/s/c/.bazelrc: --remote_cache=grpcs://bazel-remote-vpce-service-privatelink.squarecloudservices.com --experimental_remote_cache_async=true
INFO: Found applicable config definition common:ios_release in file /Users/build/.jenkins/workspace/cash-ios/ios-builder/s/c/.bazelrc: --config=release --ios_multi_cpus=arm64 --@build_bazel_rules_apple//apple/build_settings:use_tree_artifacts_outputs=false --config=generate_dsym --objc_enable_binary_stripping --define=apple.trim_lproj_locales=yes --features=dead_strip --features=swift.opt_uses_wmo --@build_bazel_rules_swift//swift:copt=-Xfrontend --@build_bazel_rules_swift//swift:copt=-internalize-at-link
INFO: Found applicable config definition common:release in file /Users/build/.jenkins/workspace/cash-ios/ios-builder/s/c/.bazelrc: --build_config=release --compilation_mode=opt --//Pods/cocoapods-bazel:config=release --//Pods/cocoapods-bazel:deps_config=deps_release
INFO: Found applicable config definition common:generate_dsym in file /Users/build/.jenkins/workspace/cash-ios/ios-builder/s/c/.bazelrc: --apple_generate_dsym --output_groups=+dsyms
INFO: Found applicable config definition common:alpha in file /Users/build/.jenkins/workspace/cash-ios/ios-builder/s/c/.bazelrc: --release_variant=alpha
- The remote cache implementation we're using: https://github.com/bazel-ios/bazel-buildfarm/tree/bazel-ios-fork
@luispadron Can you provide the `--experimental_remote_grpc_log` for a build exhibiting this failure? Otherwise, it's going to be difficult to make progress on this.
Hi. We're also seeing this issue although it's very intermittent with only 6 builds out of 46,046 in the last week impacted. We're using a simple HTTP cache (nginx caching proxy in front of Artifactory) and are quite confident it's not a cache issue.
I notice in your first message you jumped to `--remote_download_all`. Our builds are a mixture of `--remote_download_all` and `--remote_download_toplevel`, and while we aren't seeing this often, it does appear that it's always with `--remote_download_all` builds.
Please let us know what additional information/logs would be helpful.
@miscott2 I think the `--experimental_remote_grpc_log` for one of the failed runs would be the most useful piece of information here. (You can scrub any sensitive information, but please preserve the digests.)
Oh wait, but if you're using an HTTP cache, there's no gRPC log; never mind.
@tjgq Since, as miscott2 says, we're using HTTP rather than gRPC, is there some other information that would be useful in that case? In one of your previous posts you asked if this could be reproduced using an HTTP cache, to which the answer very much appears to be "yes", so is there a way to gather useful info in that case?
@NeilKetley What's the eviction policy for your HTTP cache? i.e., do you have any automated process that periodically removes old entries from the cache? Would you be able to confirm whether the blob in question was present in the cache at some point, but later got deleted? I'm wondering whether this might be just a special case of #18696.
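To check whether a blob is present at a given moment, one could probe the cache directly. This is a minimal sketch, assuming the cache follows Bazel's HTTP caching layout, in which CAS blobs are served under `/cas/<hash>`, and assuming the server answers `HEAD` requests. `cas_url`, `blob_present`, and any host passed to them are hypothetical.

```python
import urllib.error
import urllib.request


def cas_url(base_url, blob_hash):
    """Build the URL of a CAS blob, assuming the /cas/<hash> layout."""
    return f"{base_url.rstrip('/')}/cas/{blob_hash}"


def blob_present(base_url, blob_hash, timeout=10.0):
    """Return True if the cache reports the blob, False on HTTP 404.

    Other HTTP errors are re-raised so an outage is not mistaken for eviction.
    """
    request = urllib.request.Request(cas_url(base_url, blob_hash), method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return 200 <= response.status < 300
    except urllib.error.HTTPError as err:
        if err.code == 404:
            return False
        raise
```

Probing the hash from a failed build's error message shortly after the failure, and again later, might show whether the blob was there and then disappeared.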
I'm going with the theory that this is the same as #18696, which has been fixed in 7.4.0. Please reopen if you're seeing similar failures in 7.4.0 or later.
@tjgq apologies for not responding sooner. We are still attempting to repro and collect the information / answer the questions you posed previously. I do not think we will be able to try a later version of Bazel at this point since this issue is happening in our live build system, not really open for experimentation, but we will collect the info requested and hope that this will either confirm your suspicion or show otherwise.