rules_go icon indicating copy to clipboard operation
rules_go copied to clipboard

Merge GoLibrary and GoSource providers

Open dzbarsky opened this issue 1 year ago • 15 comments

What type of PR is this? Code simplification

What does this PR do? Why is it needed? These providers are 1:1, and we always create a library followed by source_from_library. Merging them allows to remove the library field from GoSource as well as the the resolve field, since we don't need to smuggle the closure between the providers anymore.

If we do this, we will also need to fix up Gazelle. We can either have a patch in this repo so CI passes and then followup with the Gazelle fix, or somehow shim the API until we release new versions in lockstep. Open to ideas how to best do this.

Which issues(s) does this PR fix? One step toward https://github.com/bazelbuild/rules_go/issues/2578

Fixes #

Other notes for review

dzbarsky avatar Aug 11 '24 20:08 dzbarsky

Can we shim the library, perhaps by having GoLibrary remain an alias of GoSource for a while? Otherwise this change could be too breaking to dependents.

fmeum avatar Aug 12 '24 09:08 fmeum

Can we shim the library, perhaps by having GoLibrary remain an alias of GoSource for a while? Otherwise this change could be too breaking to dependents.

Aliasing them is a good idea, I think I'll keep the existing APIs for now but implement library_to_source in terms of new_source and remove the documentation. I think that should avoid breaking downstream repos (unless they are using source.library or something)

dzbarsky avatar Aug 12 '24 16:08 dzbarsky

From an end user perspective, I would think that a x_library rule should be producing a *Library provider rather than an *Source provider. So I would rather keep GoLibrary and ditch GoSource if possible.

sluongng avatar Aug 13 '24 07:08 sluongng

Can we just call it GoInfo and keep aliases around? That's arguably the canonical name.

fmeum avatar Aug 13 '24 08:08 fmeum

Yeah GoInfo works as well, following the convention of CcInfo and PyInfo.

👍 to keep aliases around for backward compatibility

sluongng avatar Aug 13 '24 08:08 sluongng

GoInfo is a great idea; updated. I've kept the source param name to GoArchive for now; I'd like to take a look at merging those as well after this lands. Other than that, I've renamed the usage in the repo

dzbarsky avatar Aug 13 '24 16:08 dzbarsky

Hey @dzbarsky, I'm having trouble with testing this PR with our thriftrw plugin. Here's the code for that:

def _go_thriftrw_library_impl(ctx):
    go = go_context(ctx)
    go_srcs = go_thriftrw_compile(
        go,
        ...
    )

    library = go.new_library(
        go,
        resolver = _resolve,
        srcs = go_srcs,
    )
    source = go.library_to_source(go, ctx.attr, library, ctx.coverage_instrumented())
    archive = go.archive(go, source)

    return [
        library,
        source,
        archive,
        DefaultInfo(
            files = depset([archive.data.file]),
            runfiles = archive.runfiles,
        ),
        ...


_go_thriftrw_library = rule(
    implementation = _go_thriftrw_library_impl,
    attrs = {
        "deps": attr.label_list(
            providers = [GoInfo], # just changed this
        ),
        ...
        "_go_context_data": attr.label(default = "@io_bazel_rules_go//:go_context_data"),

Here's the error I'm getting:

ERROR: /home/user/go-code/idl/../BUILD.bazel:13:20: in _go_thriftrw_library rule //idl/..:object_config_host_agent_go_thriftrw: 
Traceback (most recent call last):
        File "/home/user/go-code/rules/thrift/go_thriftrw.bzl", line 149, column 25, in _go_thriftrw_library_impl
                archive = go.archive(go, source)
        File "/home/user/.cache/bazel/_bazel_tfrench/b97476d719d716accead0f2d5b93104f/external/io_bazel_rules_go/go/private/actions/archive.bzl", line 76, column 23, in emit_archive
                files.append(a.runfiles)
Error: 'Target' value has no field or method 'runfiles' (did you mean 'files'?)

Do you know why this might be?

Also, I tried using load("@io_bazel_rules_go//go/private:context.bzl", "new_go_info") instead, and it also had the same issue. But, I'm wondering if we should attach new_go_info as a helper to go_context (as is mentioned in the readme).

Let me know if you have any ideas what might be happening. Thanks!

tyler-french avatar Aug 22 '24 21:08 tyler-french

@tyler-french Thanks for giving it a shot! I suspect that what might be happening is GoInfo.deps previously could be a list of either GoArchive providers or Targets, but now we expect it to be a list of GoArchive providers. I'm not sure what your _resolve function is doing, but perhaps that needs an update? Alternately, if that's the cause, we may be able to make _merge_embed ensure that the deps are normalized to provider fields, to avoid this being a breaking change. If it's possible to share the actual code, I'm happy to dig deeper!

I recommend to switch to calling new_go_info to future-proof, it should have been exposed in the defs.bzl, that was an oversight I noticed while also drafting release notes, I'll update this PR momentarily and also fix the docs. I would prefer not to put it on go_context for the following reasons:

  1. Loading it directly makes it more understandable for IDE bazel plugins, so you can click through it to jump to def
  2. I think it's weird API design to have to call go.new_go_info(go, ...) and also enforce that you are passing in the same context; it wants to be an instance method but those don't exist
  3. It's slightly more efficient to not add extra fields to the context struct that is created for every action :)

dzbarsky avatar Aug 23 '24 00:08 dzbarsky

@dzbarsky @fmeum @linzhp I think I would like to wait to merge this PR until after we release v0.50.0. And then once we do merge this, we can make a v0.51.0-rc1.

The only other breaking changes after 0.49 are changing the inputs to a depset and separating out Nogo.

That was we reduce the amount of breaking changes in this release, and also will soon give users the chance to migrate their providers.

I want to make sure we do our best to make sure consumers of rules_go have a good experience.

tyler-french avatar Aug 25 '24 23:08 tyler-french

@tyler-french That plan sounds good to me. BTW I'm not sure if your breakage with the thrift rules is caused by this PR, it might be another one of the PRs. Here's some potential breaking changenots I drafted (though most of these are very minor and unlikely to be an actual issue)

image image

dzbarsky avatar Aug 26 '24 00:08 dzbarsky

BTW I'm not sure if your breakage with the thrift rules is caused by this PR

@dzbarsky You're right, it seems to be a different PR. Here's the full code of the plugin:

load("//rules/thrift:thrift.bzl", "ThriftInfo")
load("@io_bazel_rules_go//go:def.bzl", "GoLibrary", "go_context")
load("@io_bazel_rules_go//go/private/rules:transition.bzl", "go_reset_target")
load("@bazel_skylib//lib:paths.bzl", "paths")

THRIFRW_OUTPUT_FILE = "thriftrw-gen.go"

GoThriftRWPlugin = provider(
    fields = [
        "expected_pkg_files",
        "plugin_name",
        "plugin",
        "plugin_deps",
        "plugin_flags",
    ],
)

def _resolve(go, attr, source, merge):
    plugin_deps = []
    for c in attr.plugins:
        plugin_deps.extend(c[GoThriftRWPlugin].plugin_deps)
    source["deps"] = source["deps"] + plugin_deps + attr._default_deps

def _thrift_file_name(file_path):
    file_name = paths.basename(file_path)
    words = paths.split_extension(file_name)
    if len(words) != 2:
        fail("Unable to parse the thrift file for name: {}".format(file_name))
    return words[0]

def _parse_pkg(package):
    package_tokens = package.split("/")
    if len(package_tokens) == 1:
        return "base", None
    elif len(package_tokens) == 2:
        service_word = package_tokens[1]

        pkg_suffixes = ["client", "server", "fx", "test"]
        for suffix in pkg_suffixes:
            if service_word.endswith(suffix):
                return suffix, service_word[:-len(suffix)]
        fail("unable to parse the type of package: {}".format(package))
    fail("the package: {} has an invalid number of components".format(package))

def _thriftrw_output(go, directory, package, source, plugins):
    """
    _thriftrw_output declares the files that should be generated by ThriftRW
    against a package. This selects the subset of files generated by ThriftRW
    that correspond to a ThriftRW generated package.
    """
    gen_files = []
    pkg_type, thrift_service = _parse_pkg(package)
    output_dir = None

    pkg_tmpls = []
    for m in [{"base": [THRIFRW_OUTPUT_FILE]}] + [p.expected_pkg_files for p in plugins]:
        pkg_tmpls += m.get(pkg_type, [])
    for t in pkg_tmpls:
        rel = paths.join(directory, _thrift_file_name(source.path), t.format(service_name = thrift_service))
        f = go.declare_file(go, rel)
        gen_files += [f]
        output_dir = f.path[:len(f.path) - len(rel)]
    return gen_files, output_dir

def go_thriftrw_compile(go, compiler, thrift_root, plugins, source, transitive_sources, importpath, package):
    """
    go_thriftrw_compile invokes the ThriftRW compiler with a particular plugin.
    Args:
        go: the Go context.
        compiler: the ThritRWCompiler.
        sources: the direct Thrift sources.
        transitive_sources: the transitive Thrift sources.
        importpath: the generated import path.
        package: the generated package.
    Returns:
        The generated Go sources.
    """
    thrift_root = paths.join(source.root.path, thrift_root)
    if not importpath.endswith(package):
        fail("package must be at end of import path:\n  {}\n  {}".format(importpath[-len(package):], package))

    directory = paths.dirname(source.path[len(thrift_root) + 1:])

    gen_files, out_dir = _thriftrw_output(go, directory, package, source, plugins)

    args = go.actions.args()
    args.add("--thrift-root", thrift_root)
    args.add("--output-file", THRIFRW_OUTPUT_FILE)
    args.add("--pkg-prefix", importpath.partition("/")[0])
    args.add("--no-recurse")
    for p in plugins:
        plugin_arg = p.plugin_name
        if p.plugin_flags:
            plugin_arg = "{} {}".format(p.plugin_name, p.plugin_flags)
        args.add("--plugin", plugin_arg)
    args.add("--out", out_dir)
    args.add(source)

    go.actions.run_shell(
        command = """
            export PATH={path}:$PATH &&
            exec {cmd} "$@"
        """.format(
            cmd = compiler.path,
            path = ":".join(["$(pwd)/{}".format(p.plugin.dirname) for p in plugins]),
        ),
        outputs = gen_files,
        inputs = transitive_sources,
        arguments = [args],
        tools = [compiler] + [p.plugin for p in plugins],
        mnemonic = "GoThriftrwGen",
    )

    return gen_files

def _go_thriftrw_library_impl(ctx):
    go = go_context(ctx)
    go_srcs = go_thriftrw_compile(
        go,
        compiler = ctx.executable.thriftrw,
        plugins = [c[GoThriftRWPlugin] for c in ctx.attr.plugins],
        thrift_root = ctx.attr.thrift_root,
        source = ctx.attr.thrift[ThriftInfo].src,
        transitive_sources = ctx.attr.thrift[ThriftInfo].transitive_srcs.to_list(),
        importpath = ctx.attr.importpath,
        package = ctx.attr.package,
    )

    library = go.new_library(
        go,
        resolver = _resolve,
        srcs = go_srcs,
    )
    source = go.library_to_source(go, ctx.attr, library, ctx.coverage_instrumented())
    archive = go.archive(go, source)

    return [
        library,
        source,
        archive,
        DefaultInfo(
            files = depset([archive.data.file]),
            runfiles = archive.runfiles,
        ),
        OutputGroupInfo(
            compilation_outputs = [archive.data.file],
            go_generated_srcs = go_srcs,
        ),
    ]

_go_thriftrw_library = rule(
    implementation = _go_thriftrw_library_impl,
    attrs = {
        "deps": attr.label_list(
            providers = [GoLibrary],
        ),
        "thrift": attr.label(
            mandatory = True,
            providers = [ThriftInfo],
        ),
        "importpath": attr.string(),
        "package": attr.string(),
        "plugins": attr.label_list(
            providers = [GoThriftRWPlugin],
            default = ["//rules/thrift:go_thriftrw_yarpc"],
        ),
        "thrift_root": attr.string(default = "idl"),
        "thriftrw": attr.label(
            executable = True,
            cfg = "exec",
            default = "@org_uber_go_thriftrw//:thriftrw",
        ),
        "_go_context_data": attr.label(default = "@io_bazel_rules_go//:go_context_data"),
        "_default_deps": attr.label_list(
            default = [
                "@org_uber_go_thriftrw//protocol/binary:go_default_library",
                "@org_uber_go_thriftrw//protocol/stream:go_default_library",
            ],
        ),
    },
    toolchains = ["@io_bazel_rules_go//go:toolchain"],
)

def go_thriftrw_library(name, **kwargs):
    if "_thriftrw" not in kwargs:
        kwargs["_thriftrw"] = Label("@org_uber_go_thriftrw//:thriftrw")
    thriftrw = kwargs.pop("_thriftrw")

    reset_plugin_name = name + "_reset_plugin_"
    go_reset_target(
        name = reset_plugin_name,
        dep = thriftrw,
        visibility = ["//visibility:private"],
    )
    _go_thriftrw_library(
        name = name,
        thriftrw = reset_plugin_name,
        **kwargs
    )

def _go_thriftrw_plugin_impl(ctx):
    thriftrw_plugin_prefix = "thriftrw-plugin-"
    plugin_basename = ctx.executable.plugin.basename
    plugin_name_index = plugin_basename.rfind(thriftrw_plugin_prefix)
    plugin_name = plugin_basename[plugin_name_index + len(thriftrw_plugin_prefix):]
    return [
        GoThriftRWPlugin(
            plugin_deps = ctx.attr.plugin_deps,
            plugin_name = plugin_name,
            plugin = ctx.executable.plugin,
            expected_pkg_files = ctx.attr.expected_pkg_files,
            plugin_flags = ctx.attr.plugin_flags,
        ),
    ]

_go_thriftrw_plugin = rule(
    implementation = _go_thriftrw_plugin_impl,
    attrs = {
        "expected_pkg_files": attr.string_list_dict(doc = "A map of package types to expected file outputs from code generation. Example: {'client': ['{service_name}client/client.go']}"),
        "plugin_deps": attr.label_list(providers = [GoLibrary]),
        "plugin": attr.label(
            allow_single_file = True,
            executable = True,
            cfg = "exec",
        ),
        "plugin_flags": attr.string(doc = "An optional flag string the plugin can be invoked with"),
        "_go_context_data": attr.label(default = "@io_bazel_rules_go//:go_context_data"),
    },
    toolchains = ["@io_bazel_rules_go//go:toolchain"],
)

def go_thriftrw_plugin(name, **kwargs):
    plugin = kwargs.pop("plugin")
    reset_plugin_name = name + "_reset_plugin_"
    go_reset_target(
        name = reset_plugin_name,
        dep = plugin,
        visibility = ["//visibility:private"],
    )
    _go_thriftrw_plugin(
        name = name,
        plugin = reset_plugin_name,
        **kwargs
    )

I am slightly concerned about all of these breaking changes here.

tyler-french avatar Aug 26 '24 15:08 tyler-french

@tyler-french Can you try replacing your _resolve function with the following?

def _resolve(go, attr, source, merge):
    plugin_deps = []
    for c in attr.plugins:
        plugin_deps.extend(c[GoThriftRWPlugin].plugin_deps)
    source["deps"] += [dep[GoArchive] for dep in plugin_deps + attr._default_deps]

Your concern is fair but i think the list is a bit pedantic and a lot of the changes are to providers that are already declared to be non-public API in the docs. In practice the actual changes that I think may cause some snags:

  1. The SDK depsets
  2. The deps becoming GoArchive
  3. go.mode.link becoming go.mode.linkmode

I think we can smooth out (2) by fixing up source["deps"] after calling the resolver inside the rules, but let's confirm that this fixes things for you first and whether there are other issues

dzbarsky avatar Aug 26 '24 15:08 dzbarsky

I also like the release plan. We should probably fix https://github.com/bazelbuild/rules_go/issues/4062 for that release, I'll look into it tomorrow.

The breaking changes David listed are also the three I consider most relevant. As we will document them in the release notes, I think that migration won't be too problematic given how much users can gain.

@dzbarsky Do you have before/after CPU/memory data for your series if patches?

fmeum avatar Aug 26 '24 20:08 fmeum

Sorry, hitting another failure here:

 % bazel build --nobuild //...
INFO: Writing tracer profile to '/tmp/bazel_20240827160435_o7sby'
INFO: Invocation ID: b049c579-1db2-41b8-a63c-157e0b994068
ERROR: /home/user/go-code/src/../BUILD.bazel:27:14: in _glue_generate rule //src/...:glue_gateway: 
Traceback (most recent call last):
        File "/home/user/go-code/rules/glue/glue.bzl", line 107, column 34, in _glue_generate_impl
                source = go.library_to_source(go, ctx.attr, library, ctx.coverage_instrumented())
        File "/home/user/.cache/bazel/_bazel_tfrench/b97476d719d716accead0f2d5b93104f/external/io_bazel_rules_go/go/private/context.bzl", line 312, column 24, in _library_to_source
                library.resolve(go, attr, source, _merge_embed)
        File "/home/user/go-code/rules/glue/glue.bzl", line 8, column 20, in _resolve
                tdeps = dep[GoSource].deps
Error: type 'struct' has no operator [](Provider)
ERROR: /home/user/go-code/..../BUILD.bazel:27:14: Analysis of target '//src/...:glue_gateway' failed
Analyzing: 486068 targets (21318 packages loaded, 12690 targets configured)
FATAL: bazel ran out of memory and crashed. Printing stack trace:
net.starlark.java.eval.Starlark$UncheckedEvalError: OutOfMemoryError thrown during Starlark evaluation (//idl/code.uber.internal/rds/hubbub:glue_mock)
        at <starlark>.print(<builtin>:0)
        at <starlark>._resolve(/home/user/go-code/rules/glue/mock.bzl:13)
        at <starlark>._library_to_source(/home/user/.cache/bazel/_bazel_tfrench/b97476d719d716accead0f2d5b93104f/external/io_bazel_rules_go/go/private/context.bzl:312)
        at <starlark>._glue_generate_mock_impl(/home/user/go-code/rules/glue/mock.bzl:64)
Caused by: java.lang.OutOfMemoryError: Java heap space
        at java.base/java.util.Arrays.copyOf(Unknown Source)
        at java.base/java.lang.AbstractStringBuilder.ensureCapacityInternal(Unknown Source)
        at java.base/java.lang.AbstractStringBuilder.append(Unknown Source)
        at java.base/java.lang.StringBuilder.append(Unknown Source)
        at java.base/java.lang.StringBuilder.append(Unknown Source)
        at java.base/java.lang.AbstractStringBuilder.append(Unknown Source)
        at java.base/java.lang.StringBuilder.append(Unknown Source)
        at net.starlark.java.eval.Printer.append(Printer.java:60)
        at com.google.devtools.build.lib.collect.nestedset.Depset.repr(Depset.java:305)
        at net.starlark.java.eval.Printer.repr(Printer.java:195)
        at com.google.devtools.build.lib.packages.StructImpl.repr(StructImpl.java:136)
        at net.starlark.java.eval.Printer.repr(Printer.java:195)
        at com.google.devtools.build.lib.packages.StructImpl.repr(StructImpl.java:136)
        at net.starlark.java.eval.Printer.repr(Printer.java:195)
        at net.starlark.java.eval.Printer.printList(Printer.java:101)
        at net.starlark.java.eval.StarlarkList.repr(StarlarkList.java:249)
        at net.starlark.java.eval.Printer.repr(Printer.java:195)
        at com.google.devtools.build.lib.packages.StructImpl.repr(StructImpl.java:136)
        at net.starlark.java.eval.Printer.repr(Printer.java:195)
        at net.starlark.java.eval.Printer.printList(Printer.java:101)
        at net.starlark.java.eval.StarlarkList.repr(StarlarkList.java:249)
        at net.starlark.java.eval.Printer.repr(Printer.java:195)
        at com.google.devtools.build.lib.packages.StructImpl.repr(StructImpl.java:136)
        at net.starlark.java.eval.Printer.repr(Printer.java:195)
        at net.starlark.java.eval.Printer.printList(Printer.java:101)
        at net.starlark.java.eval.StarlarkList.repr(StarlarkList.java:249)
        at net.starlark.java.eval.Printer.repr(Printer.java:195)
        at com.google.devtools.build.lib.packages.StructImpl.repr(StructImpl.java:136)
        at net.starlark.java.eval.Printer.repr(Printer.java:195)
        at com.google.devtools.build.lib.packages.StructImpl.repr(StructImpl.java:136)
        at net.starlark.java.eval.Printer.repr(Printer.java:195)
        at net.starlark.java.eval.Printer.printList(Printer.java:101)
WARNING: Waiting for server process to terminate (waited 5 seconds, waiting at most 60)

This is the resolve function here:

def _resolve(go, attr, source, merge):
    libdeps = []
    lib = attr.library
    if GoSource in lib:
        libdeps = lib[GoSource].deps

    transitive = []
    for libdep in libdeps:
        tdeps = libdep[GoSource].deps
        transitive += tdeps

    source["deps"] = source["deps"] + transitive + libdeps + attr._generated_mock_deps

tyler-french avatar Aug 27 '24 16:08 tyler-french

@tyler-french No problem, thanks for iterating with me on this :)

If I'm reading correctly, libdeps are a list of GoArchive, so you probably want to change it like so:

-        tdeps = libdep[GoSource].deps
+        tdeps = libdep.source.deps

does that work?

dzbarsky avatar Aug 27 '24 17:08 dzbarsky

@fmeum shall we merge this? Are we waiting for anything else?

dzbarsky avatar Nov 16 '24 04:11 dzbarsky

@fmeum shall we merge this? Are we waiting for anything else?

Yes and no, just me :-)

fmeum avatar Nov 18 '24 15:11 fmeum