Calls to https://go.dev/dl/?mode=json are breaking airgapped builds - provide way to avoid these

Open peakschris opened this issue 1 year ago • 8 comments

What version of rules_go are you using?

0.47.1

What version of gazelle are you using?

0.36.0

What version of Bazel are you using?

7.2.0rc1

Does this issue reproduce with the latest releases of all the above?

Yes

What operating system and processor architecture are you using?

Windows

Any other potentially useful information about your toolchain?

N/a

What did you do?

When I run bazel build in an airgapped environment, the build fails because it cannot connect to https://go.dev/dl/?mode=json&include=all. We have set up mirrors for github.com, dl.google.com, etc. on our internal Artifactory, but Artifactory will not mirror this URL because it contains a query string.

I would like some way to input the required information into the use of rules_go in our MODULE.bazel file, and avoid the need for rules_go to make this request. I see in the code that a parameter named sdks can be used to avoid this request, but I can't see how to provide it in a module-based build.

An alternative would be to allow this URL to be overridden in the module file; then I could download this file and vendor it as versions.json
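
A minimal sketch of what consuming such a vendored copy could look like, assuming the saved versions.json keeps the structure of the go.dev response (a list of releases, each with a `version` and a `files` list whose entries carry `kind`, `os`, `arch`, `filename` and `sha256`). The function names are hypothetical and not part of rules_go:

```python
import json


def load_release_list(path):
    """Read a vendored copy of the Go release list from disk (no network access)."""
    with open(path, encoding="utf-8") as fh:
        return json.load(fh)


def sdks_from_release_list(releases, semver,
                           oses=("linux", "darwin"),
                           arches=("arm64", "amd64")):
    """Map "<os>_<arch>" to (filename, sha256) for the archives of one Go version.

    Returns an empty dict when the version is not present in the list.
    """
    wanted = "go" + semver
    for entry in releases:
        if entry.get("version") != wanted:
            continue
        return {
            f'{f["os"]}_{f["arch"]}': (f["filename"], f["sha256"])
            for f in entry.get("files", [])
            if f.get("kind") == "archive"
            and f.get("os") in oses
            and f.get("arch") in arches
        }
    return {}
```

Calling `sdks_from_release_list(load_release_list("versions.json"), "1.22.3")` would then yield the platform-to-archive mapping without contacting go.dev.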

peakschris, May 17 '24 12:05

This is where the request is made: https://github.com/bazelbuild/rules_go/blob/e7ddb9ea474e6b5137dfc074f913529df80d7e5c/go/private/sdk.bzl#L75

peakschris, May 17 '24 12:05

This is an interesting problem that I think can be solved by introducing an intermediate extension that records the extracted information in the MODULE.bazel.lock file. I'll give this some more thought and try to implement it.

fmeum, May 17 '24 22:05

@fmeum thank you for looking at this! We cannot keep MODULE.bazel.lock under source control: we don't use Git, and merges of this huge file (2 MB for us) are unsupported and would be incorrect more often than not. A solution that specifies the inputs directly in MODULE.bazel or in a separate file would work much better for us.

peakschris, May 17 '24 23:05

@peakschris When you say "don't use Git", what are you using instead and why does that prevent you from checking in MODULE.bazel.lock? Bazel 7.2.0rc1 comes with a revised lockfile format that is much more VCS friendly and less verbose.

Generally speaking, I would like to first find a general solution that works out of the box, but I'm sure we'll find a way to support your use case.

fmeum, May 18 '24 09:05

@fmeum we use a custom VCS layer with Perforce as the backend. The VCS layer was written years ago and is in maintenance mode now, so it's practically impossible to get significant changes made. It does not support custom merge drivers.

We also have a significant problem with contentious files that cause MRs to back up in our test pipeline. MODULE.bazel.lock would be a massively contentious file.

So there are two issues:

  1. MODULE.bazel.lock can't be auto-merged in our system and will often end up broken. I can't give developers a tool to run manually, as some merge steps happen in CI via a web UI.
  2. MODULE.bazel.lock would cause MRs to queue up and retest. We don't have a good merge train like GitHub.

I'm sure this will be an issue for others too, who have similar constraints. I don't have these issues with rules_js, etc, and I like their approach of providing the inputs to the toolchain in the MODULE.bazel file.

We can split up MODULE.bazel via include statements that allow different elements of the build to be specified in smaller files that are not contentious and are obvious to merge graphically.

Thanks again for thinking about this

peakschris, May 18 '24 10:05

I've actually stumbled on a way to avoid these queries:

bazel_dep(name = "rules_go", version = "0.47.1")
go_sdk = use_extension("@rules_go//go:extensions.bzl", "go_sdk")
go_sdk.download(
    version = "1.22.3",
    # explicitly specify SDK names/checksums to avoid a query which fails in airgapped builds
    # get checksums from https://go.dev/dl/?mode=json&include=all
    sdks = {
        "linux_amd64": ("go1.22.3.linux-amd64.tar.gz", "8920ea521bad8f6b7bc377b4824982e011c19af27df88a815e3586ea895f1b36"),
        "windows_arm64": ("go1.22.3.windows-arm64.zip", "59b76ee22b9b1c3afbf7f50e3cb4edb954d6c0d25e5e029ab5483a6804d61e71"),
    },
)

Would it be possible to update documentation (https://github.com/bazelbuild/rules_go/blob/master/go/toolchains.rst#go_download_sdk) to include examples of use with Modules? This is what I was missing.
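
Hand-copied sdks entries are easy to get subtly wrong (a stray dot in a filename, a truncated checksum). Below is a minimal sketch of a sanity check, assuming the release naming scheme go<version>.<os>-<arch>.tar.gz, with .zip for Windows archives. The `check_sdks` helper is hypothetical and not part of rules_go:

```python
import re

# Assumed naming scheme: go<version>.<os>-<arch>.<ext>
_FILENAME_RE = re.compile(
    r"go(\d+(?:\.\d+)*(?:(?:rc|beta)\d+)?)\.([a-z0-9]+)-([a-z0-9]+)\.(tar\.gz|zip)"
)
_SHA256_RE = re.compile(r"[0-9a-f]{64}")


def check_sdks(version, sdks):
    """Return a list of human-readable problems found in an sdks mapping."""
    problems = []
    for platform, (filename, sha256) in sorted(sdks.items()):
        match = _FILENAME_RE.fullmatch(filename)
        if match is None:
            problems.append(f"{platform}: unexpected filename {filename!r}")
        else:
            file_version, goos, goarch, ext = match.groups()
            if file_version != version:
                problems.append(f"{platform}: filename is for Go {file_version}, not {version}")
            if f"{goos}_{goarch}" != platform:
                problems.append(f"{platform}: filename is for {goos}_{goarch}")
            # Windows archives are published as .zip, all others as .tar.gz.
            if (goos == "windows") != (ext == "zip"):
                problems.append(f"{platform}: unexpected extension .{ext}")
        if _SHA256_RE.fullmatch(sha256) is None:
            problems.append(f"{platform}: checksum is not 64 lowercase hex characters")
    return problems
```

An empty list means every entry at least has the expected shape; it does not prove that a checksum is the right one for its file.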

peakschris, May 19 '24 13:05

Thanks for the tip, @peakschris!

Building on yours, I'm using the following curl + jq command to dump output we can copy/paste into our MODULE.bazel file.

This command ...

curl -s https://go.dev/dl/\?mode\=json\&include\=all | jq -r '.[] | select(.version == "go1.19.13") | .files | .[] | select(.kind == "archive" and (.os == "linux" or .os == "darwin") and (.arch == "arm64" or .arch == "amd64") ) | "\"\(.os)_\(.arch)\": (\"\(.filename)\", \"\(.sha256)\")," '

... will produce this output you can copy/paste into sdks = { ... } ...

"darwin_amd64": ("go1.19.13.darwin-amd64.tar.gz", "1b4329dc9e73def7f894ca71fce78bb9f3f5c4c8671b6c7e4f363a3f47e88325"),
"darwin_arm64": ("go1.19.13.darwin-arm64.tar.gz", "022b35fa9c79b9457fa4a14fd9c4cf5f8ea315a8f2e3b3cd949fea55e11a7d7b"),
"linux_amd64": ("go1.19.13.linux-amd64.tar.gz", "4643d4c29c55f53fa0349367d7f1bb5ca554ea6ef528c146825b0f8464e2e668"),
"linux_arm64": ("go1.19.13.linux-arm64.tar.gz", "1142ada7bba786d299812b23edd446761a54efbbcde346c2f0bc69ca6a007b58"),

fpotter, Aug 22 '24 20:08

A dependency-free Python version:

#!/usr/bin/env python3

import urllib.request
import json
import sys


def main():
    if len(sys.argv) != 2:
        print("Usage: main.py <go_version>")
        print("Example: main.py 1.24.2")
        sys.exit(1)

    semver = sys.argv[1]
    url = "https://go.dev/dl/?mode=json&include=all"
    version = "go" + semver

    try:
        with urllib.request.urlopen(url) as response:
            data = json.loads(response.read().decode())
    except Exception as e:
        print(f"Error fetching or parsing data: {e}")
        sys.exit(1)

    files_by_platform = {}

    for entry in data:
        if entry.get("version") == version:
            for file in entry.get("files", []):
                if (
                    file.get("kind") == "archive"
                    and file.get("os") in ("linux", "darwin")
                    and file.get("arch") in ("arm64", "amd64")
                ):
                    key = f'{file["os"]}_{file["arch"]}'
                    files_by_platform[key] = (file["filename"], file["sha256"])
            break

    if not files_by_platform:
        print(f"Version '{version}' not found in the Go releases list.")
        sys.exit(1)

    print("go_sdk.download(")
    print(f'    version = "{semver}",')
    print("    sdks = {")
    for platform, (filename, sha256) in sorted(files_by_platform.items()):
        print(f'        "{platform}": ("{filename}", "{sha256}"),')
    print("    },")
    print(")")


if __name__ == "__main__":
    main()

Upgrading is then just a matter of running the py_binary target and copying its stdout:

bazel run //python/target_name:bin -- 1.24.2
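
Since an airgapped build fetches the archives from an internal mirror, it can be worth confirming that a mirrored file matches the checksum pinned in sdks before committing the entry. A minimal sketch using only the standard library; the helper names are hypothetical:

```python
import hashlib


def sha256_of(path, chunk_size=1 << 20):
    """Compute the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_pinned(path, expected_sha256):
    """Return True if the file's digest equals the pinned checksum."""
    return sha256_of(path) == expected_sha256.strip().lower()
```

For example, `matches_pinned("go1.24.2.linux-amd64.tar.gz", pinned)` would be called with the path of the archive downloaded from the mirror and the checksum string from the corresponding sdks entry.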

eightseventhreethree, Apr 16 '25 00:04