rules_go icon indicating copy to clipboard operation
rules_go copied to clipboard

Feature request: Provide full path to go binary via an environment variable

Open fishy opened this issue 1 year ago • 10 comments

What version of rules_go are you using?

0.50.1

What version of gazelle are you using?

0.39.0

What version of Bazel are you using?

7.3.2

Does this issue reproduce with the latest releases of all the above?

Yes

What operating system and processor architecture are you using?

linux/amd64

Any other potentially useful information about your toolchain?

What did you do?

For reasons, we have an internal library that heavily depends on go binary (it runs go commands via os/exec, etc.). In its provided test lib, when setting up a test it also depends on the go binary to do the setup.

Without bazel, it could just assume go is in $PATH and run it blindly. But with bazel that assumption is broken and it needs to know where to find the go binary. This isn't a problem when developers run unit tests locally because we can still assume they have go in $PATH even if they run bazel to build/test locally, but it's a problem for CI because on CI we use a bazel based docker image without go in $PATH.

Thus why it would be great if rules_go can inject that info via some environment variable, so it can gets that info via env var (only needed in tests if we don't want to inject it everywhere).

I tried to check whether go toolchain itself does this, with test code to print everything from os.Environ, and found out that $_ would be pointing to the go binary when running go test.

e.g. With this test code:

package test

import (
        "os"
        "testing"
)

func TestPrintEnvs(t *testing.T) {
        t.Errorf("$_ = %q", os.Getenv("_"))
}

via go test it prints out:

$ go test
--- FAIL: TestPrintEnvs (0.00s)
    go_test.go:9: $_ = "/usr/lib/go-1.23/bin/go"
FAIL
exit status 1

but via bazel test :test_test it prints out something else:

INFO: From Testing //test:test_test:
==================== Test output for //test:test_test:
--- FAIL: TestPrintEnvs (0.00s)
    go_test.go:9: $_ = "/home/fishy/.cache/bazel/_bazel_fishy/d935e50f2b77a06b1180e48ced8db199/sandbox/linux-sandbox/1173/execroot/_main/bazel-out/k8-fastbuild/bin/test/test_test_/test_test.runfiles/_main/test/test_test_/test_test"
FAIL
================================================================================

What did you expect to see?

What did you see instead?

fishy avatar Oct 08 '24 16:10 fishy

Since this appears to be a very special case, how about adding a Go SDK to the runfiles of your tests and then passing its go binary to the test?

That's something you can do without requiring a rules_go change. Even if we wanted to, we couldn't just ship a full Go SDK with every test without regressing performance for everyone.

fmeum avatar Oct 08 '24 18:10 fmeum

@fmeum thanks but apologize in advance that I'm not sure how to achieve what you suggested.

  1. it doesn't look like go_test rule has attributes like runfile or runfiles. It does have data but I'm not sure if that would work for this purpose.
  2. how do I reference the go_sdk there? our go_sdk is declared in MODULE.bazel like this:
go_sdk = use_extension("@rules_go//go:extensions.bzl", "go_sdk")
go_sdk.download(
    name = "go_sdk",
    sdks = {
        "darwin_amd64": (
            "go%s.darwin-amd64.tar.gz" % GO_VERSION,
            GO_DARWIN_AMD64_SHA256,
        ),
        "darwin_arm64": (
            "go%s.darwin-arm64.tar.gz" % GO_VERSION,
            GO_DARWIN_ARM64_SHA256,
        ),
        "linux_amd64": (
            "go%s.linux-amd64.tar.gz" % GO_VERSION,
            GO_LINUX_AMD64_SHA256,
        ),
    },
    urls = [
        "https://go.dev/dl/{}",
        "https://dl.google.com/go/{}",
        # TODO: Add an internal mirror here
    ],
    version = GO_VERSION,
)

fishy avatar Oct 09 '24 16:10 fishy

@fishy Yes, adding the SDK target to data will place it into the test's runfiles. Here's an example from the repo's own tests:

go_test(
    name = "stdliblist_test",
    size = "small",
    srcs = [
        "env.go",
        "flags.go",
        "replicate.go",
        "stdliblist.go",
        "stdliblist_test.go",
    ],
    data = ["@go_sdk//:files"],
    rundir = ".",
    x_defs = {
        "rulesGoStdlibPrefix": RULES_GO_STDLIB_PREFIX,
    },
)

You can check available targets with bazel query @go_sdk//:all

Also potentially relevant for you:

➜  rules_go git:(master) bazel query --output=build @go_sdk//:go_sdk                                                                                                                                                                                           ~/rules_go
# /private/var/tmp/_bazel_dzbarsky/2c76d2042b2c36d4458efe14b865e618/external/go_sdk/BUILD.bazel:61:7
go_sdk(
  name = "go_sdk",
  goos = "darwin",
  goarch = "arm64",
  experiments = "nocoverageredesign",
  root_file = "@go_sdk//:ROOT",
  package_list = "@go_sdk//:package_list",
  libs = ["@go_sdk//:libs"],
  headers = ["@go_sdk//:headers"],
  srcs = ["@go_sdk//:srcs"],
  tools = ["@go_sdk//:tools"],
  go = "@go_sdk//:bin/go",
  version = "1.23.1",
)
# Rule go_sdk instantiated at (most recent call last):
#   /private/var/tmp/_bazel_dzbarsky/2c76d2042b2c36d4458efe14b865e618/external/go_sdk/BUILD.bazel:61:7 in <toplevel>
# Rule go_sdk defined at (most recent call last):
#   /Users/dzbarsky/rules_go/go/private/rules/sdk.bzl:39:14 in <toplevel>

dzbarsky avatar Dec 24 '24 18:12 dzbarsky

@dzbarsky thanks for the help. I think I got the files from @go_sdk//:files added to the test's runfiles, but I'm not sure how to pass that info into the test code. This is what I'm doing:

(files under $WORKSPACE_ROOT/gobin)

$ cat gobin_test.go 
package gobin

import (
        "io/fs"
        "os"
        "path/filepath"
        "testing"
)

func walkRecursion(tb testing.TB, root string) {
        fs.WalkDir(os.DirFS(root), ".", func(path string, d fs.DirEntry, err error) error {
                if err != nil {
                        tb.Errorf("fs.WalkDir: %q, %q, %v, %v", root, path, d, err)
                        return err
                }
                path = filepath.Join(root, path)
                tb.Log(path)
                if d.Type()&fs.ModeSymlink != 0 {
                        walkRecursion(tb, path)
                }
                return nil
        })
}

func TestGobin(t *testing.T) {
        t.Logf("$PWD = %q", os.Getenv("PWD"))
        walkRecursion(t, os.Getenv("PWD"))
        t.Errorf("$GOBIN = %q", os.Getenv("GOBIN"))
}

$ cat BUILD.bazel 
load("@rules_go//go:def.bzl", "go_test")

go_test(
    name = "gobin_test",
    srcs = ["gobin_test.go"],
    data = ["@go_sdk//:files"],
    env = {
        "GOBIN": "@go_sdk//:bin/go",
    },
    rundir = ".",
)

$ bazel test :gobin_test 
INFO: Invocation ID: 63b1fa39-ab9b-495e-9fa1-537ad24df93b
INFO: Analyzed target //gobin:gobin_test (0 packages loaded, 0 targets configured).
FAIL: //gobin:gobin_test (Exit 1) (see /home/fishy/.cache/bazel/_bazel_fishy/d935e50f2b77a06b1180e48ced8db199/execroot/_main/bazel-out/k8-fastbuild/testlogs/gobin/gobin_test/test.log)
INFO: From Testing //gobin:gobin_test:
==================== Test output for //gobin:gobin_test:
--- FAIL: TestGobin (0.00s)
    gobin_test.go:26: $PWD = "/home/fishy/.cache/bazel/_bazel_fishy/d935e50f2b77a06b1180e48ced8db199/sandbox/linux-sandbox/147/execroot/_main/bazel-out/k8-fastbuild/bin/gobin/gobin_test_/gobin_test.runfiles/_main/."
    gobin_test.go:17: /home/fishy/.cache/bazel/_bazel_fishy/d935e50f2b77a06b1180e48ced8db199/sandbox/linux-sandbox/147/execroot/_main/bazel-out/k8-fastbuild/bin/gobin/gobin_test_/gobin_test.runfiles/_main
    gobin_test.go:17: /home/fishy/.cache/bazel/_bazel_fishy/d935e50f2b77a06b1180e48ced8db199/sandbox/linux-sandbox/147/execroot/_main/bazel-out/k8-fastbuild/bin/gobin/gobin_test_/gobin_test.runfiles/_main/gobin
    gobin_test.go:17: /home/fishy/.cache/bazel/_bazel_fishy/d935e50f2b77a06b1180e48ced8db199/sandbox/linux-sandbox/147/execroot/_main/bazel-out/k8-fastbuild/bin/gobin/gobin_test_/gobin_test.runfiles/_main/gobin/gobin_test_
    gobin_test.go:17: /home/fishy/.cache/bazel/_bazel_fishy/d935e50f2b77a06b1180e48ced8db199/sandbox/linux-sandbox/147/execroot/_main/bazel-out/k8-fastbuild/bin/gobin/gobin_test_/gobin_test.runfiles/_main/gobin/gobin_test_/gobin_test
    gobin_test.go:13: fs.WalkDir: "/home/fishy/.cache/bazel/_bazel_fishy/d935e50f2b77a06b1180e48ced8db199/sandbox/linux-sandbox/147/execroot/_main/bazel-out/k8-fastbuild/bin/gobin/gobin_test_/gobin_test.runfiles/_main/gobin/gobin_test_/gobin_test", ".", <nil>, stat .: not a directory
    gobin_test.go:28: $GOBIN = "@go_sdk//:bin/go"
FAIL
================================================================================
INFO: Found 1 test target...
Target //gobin:gobin_test up-to-date:
  bazel-bin/gobin/gobin_test_/gobin_test
INFO: Elapsed time: 0.304s, Critical Path: 0.18s
INFO: 2 processes: 1 internal, 1 linux-sandbox.
INFO: Build completed, 1 test FAILED, 2 total actions
//gobin:gobin_test                                                       FAILED in 0.0s
  /home/fishy/.cache/bazel/_bazel_fishy/d935e50f2b77a06b1180e48ced8db199/execroot/_main/bazel-out/k8-fastbuild/testlogs/gobin/gobin_test/test.log

Executed 1 out of 1 test: 1 fails locally.

so my attempt of using go_test.env doesn't actually expand @go_sdk//:bin/go to the full path (I'm getting that string literal in the code instead), and from the rundir there's only gobin/gobin_test_/gobin_test which seems to be the symlink to the test binary instead of the runfiles.

fishy avatar Dec 27 '24 22:12 fishy

I do see the go sdk files under $WORKSPACE_ROOT/bazel-bin/gobin/gobin_test_/gobin_test.runfiles/rules_go++go_sdk+go_sdk/, but I'm not sure

  1. how stable is the rules_go++go_sdk+go_sdk name
  2. how to reference that path in the code and whether that's available when bazel test runs

also I'm wondering if there's an arg to bazel test so that it does not clean up the execroot it used to run the test (from the test log, after bazel test finishes those directories no longer exist)

fishy avatar Dec 27 '24 23:12 fishy

Instead of just the label, set the environment variable to $(rootpath @go_sdk//:bin/go). For cross-platform support, you can also use $(rlocationpath ...) and the Go runfiles library.

fmeum avatar Dec 28 '24 00:12 fmeum

You can also use the --sandbox_debug flag to keep the rest sandbox around after the run

dzbarsky avatar Dec 28 '24 00:12 dzbarsky

Thanks @fmeum & @dzbarsky , that worked (at least for local worker, it might have issues with remote workers since all the files under runfiles from @go_sdk//:files are symlinks and the real files may or may not be packed correctly to be sent to remote workers, but we don't currently have remote workers to test that and "works in local worker" is good enough for us for now 😄)

fishy avatar Dec 30 '24 16:12 fishy

Excellent! If you're on Linux, you can try to enable the hermetic-sandbox, which will hardlink files instead of symlink. It should be a decent proxy for RBE though the details depend on your exact RBE setup.

dzbarsky avatar Dec 30 '24 16:12 dzbarsky

is --strategy=worker,linux-sandbox enough to enable hermetic sandbox? the test works with that arg after I cleared bazel cache

fishy avatar Dec 30 '24 17:12 fishy