rules_swift

Flaky toolchain resolution

Open jpsim opened this issue 2 years ago • 12 comments

I occasionally see this error:

ERROR: /private/var/tmp/_bazel_runner/38952b625e1e7c72284b0f9be4a35639/external/build_bazel_rules_swift_local_config/BUILD:9:22: While resolving toolchains for target @build_bazel_rules_swift_local_config//:toolchain: No matching toolchains found for types @bazel_tools//tools/cpp:toolchain_type.

The fact that @build_bazel_rules_swift_local_config//:toolchain is in the error makes me think it could be a rules_swift bug?

Here's an example failure: https://github.com/envoyproxy/envoy/actions/runs/3651410480/jobs/6168700603 The same job passed on a re-run: https://github.com/envoyproxy/envoy/actions/runs/3651410480/jobs/6168942485

Steps to repro (not consistently reproducible):

./bazelw build \
  --config=ios \
  //examples/objective-c/hello_world:app

• rules_swift: e769f8d6a4adae93c244f244480a3ae740f24384
• rules_apple: a0f8748ce89698a599149d984999eaefd834c004
• bazel: 6.0.0rc4
• OS: macOS 13.1
• Xcode: 14.1 (14B47b)

jpsim avatar Dec 08 '22 19:12 jpsim

I think the core issue is:

No matching toolchains found for types @bazel_tools//tools/cpp:toolchain_type.

which sounds like something failed to set up before this, but I'm surprised not to see that in the log.

keith avatar Dec 08 '22 20:12 keith

What would you suggest as next steps to dive deeper? Should I add --toolchain_resolution_debug='@bazel_tools//tools/cpp:toolchain_type' to our CI jobs or would that not yield helpful results?

jpsim avatar Dec 08 '22 20:12 jpsim

If it reproduces frequently enough, it would be worth hitting it at least once with that flag, but it will definitely generate some noise as well.

I imagine the core issue is the one we've seen in the past with GitHub Actions, where Xcode discovery takes so long that it ends up timing out, and then toolchain setup just fails.

keith avatar Dec 08 '22 20:12 keith
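
If the debug flag does get added to CI, one way to keep the noise manageable is to save the full build log and print only the lines that mention the toolchain type. This is a minimal sketch, assuming the log is plain text on stdin. The helper name keep_toolchain_lines and the file name build.log are hypothetical, and the default pattern is simply the toolchain type from the error message in the original report:

```shell
#!/bin/sh
# Hypothetical helper: print only the lines of a build log (read from stdin)
# that contain the given fixed string. Defaults to the C++ toolchain type
# named in the error message.
keep_toolchain_lines() {
  pattern=${1:-@bazel_tools//tools/cpp:toolchain_type}
  grep -F -- "$pattern"
}

# Example use in a CI step (not run here; build.log is an assumed file name):
#   ./bazelw build --config=ios \
#     --toolchain_resolution_debug='@bazel_tools//tools/cpp:toolchain_type' \
#     //examples/objective-c/hello_world:app 2>&1 | tee build.log
#   keep_toolchain_lines < build.log
```

Keeping the unfiltered log as an artifact means the rare failing run can still be inspected in full.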

> I imagine the core issue is the one we've seen in the past with GitHub Actions, where Xcode discovery takes so long that it ends up timing out, and then toolchain setup just fails.

Interesting theory. Is there a way we could wait for that before starting the bazel build? Like timeout 60s xcodebuild -version? Not sure what the Xcode discovery entails.

jpsim avatar Dec 08 '22 21:12 jpsim
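
The wait-before-building idea could be sketched as a small retry loop that runs ahead of the Bazel invocation. It assumes that a successful xcodebuild -version is a reasonable proxy for Xcode being ready, which nothing in this thread confirms. The helper name retry_until_ok is hypothetical. The sketch avoids timeout because that command comes from GNU coreutils and is not part of a stock macOS install, so a command that hangs is not bounded here:

```shell
#!/bin/sh
# Hypothetical helper: run a command up to <attempts> times, sleeping
# <delay> seconds between failed attempts. Returns 0 on the first success,
# 1 if every attempt fails. Command output is discarded.
retry_until_ok() {
  attempts=$1
  delay=$2
  shift 2
  i=1
  while [ "$i" -le "$attempts" ]; do
    if "$@" >/dev/null 2>&1; then
      return 0
    fi
    i=$((i + 1))
    # Only sleep if another attempt is still coming.
    if [ "$i" -le "$attempts" ]; then
      sleep "$delay"
    fi
  done
  return 1
}

# Example use before the build (not run here):
#   retry_until_ok 6 10 xcodebuild -version || echo "Xcode not responding" >&2
```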

My theory on that issue is that it discovers every single installed Xcode version and the SDK versions for each one, and on GitHub Actions there are so many that it takes a while. My recommendation, assuming that's the case, is to do this: https://www.smileykeith.com/2021/03/08/locking-xcode-in-bazel/

keith avatar Dec 08 '22 21:12 keith

> My theory on that issue is that it discovers every single installed Xcode version and the SDK versions for each one, and on GitHub Actions there are so many that it takes a while. My recommendation, assuming that's the case, is to do this: https://www.smileykeith.com/2021/03/08/locking-xcode-in-bazel/

I think we're already doing that: https://github.com/envoyproxy/envoy/blob/6040d0b15f2cf8d822fd9631b9b969cd78d8062a/mobile/.bazelrc

--xcode_version_config=//ci:xcode_config

With only a single version of Xcode configured: https://github.com/envoyproxy/envoy/blob/6040d0b15f2cf8d822fd9631b9b969cd78d8062a/mobile/ci/BUILD#L3-L32

jpsim avatar Dec 08 '22 21:12 jpsim

Looks like that's only set in that one configuration.

keith avatar Dec 08 '22 21:12 keith

Yes, but the CI job is using that configuration.

jpsim avatar Dec 08 '22 21:12 jpsim
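
To check which configurations actually carry the flag, the rc file can be scanned for the lines that set it. This is a minimal sketch, assuming each line starts with a command, optionally followed by a :config suffix, as in a typical .bazelrc. The helper name configs_setting_flag is hypothetical, and the sample lines in the usage comment are made up rather than copied from the envoy repository:

```shell
#!/bin/sh
# Hypothetical helper: list the command[:config] prefixes of every
# non-comment line in an rc file that mentions the given flag.
configs_setting_flag() {
  flag=$1
  file=$2
  awk -v flag="$flag" '
    /^[[:space:]]*#/ { next }        # skip comment lines
    index($0, flag)  { print $1 }    # first field is e.g. "build:ios"
  ' "$file" | sort -u
}

# Example use (not run here), given an rc file containing:
#   build --jobs=4
#   build:ios --xcode_version_config=//ci:xcode_config
# the call
#   configs_setting_flag --xcode_version_config .bazelrc
# prints "build:ios".
```

If the flag shows up only under a named config, a job that does not pass the matching --config would not pick up the locked Xcode version.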

Would setting --action_env=BAZEL_DO_NOT_DETECT_CPP_TOOLCHAIN=1 help?

jpsim avatar Dec 16 '22 15:12 jpsim

It would fail later, as soon as you tried to build anything.

keith avatar Dec 16 '22 17:12 keith

If this is an Xcode discovery issue, could we add a way to explicitly specify the path or candidate paths to the Xcode version that's expected to be used?

In our case, we're hitting this on CI where the path to the Xcode to use is stable and well-known.

jpsim avatar Jan 13 '23 14:01 jpsim

Passing the Xcode version config should generally be doing that 🤔, unless it's failing on the check that just verifies whether any Xcode is installed at all.

keith avatar Jan 13 '23 14:01 keith