Buf CLI hangs indefinitely

Open antspy opened this issue 7 months ago • 5 comments

Hi,

I am having a problem with my buf CLI (version 1.54.0). I have the following buf.yaml file:

version: v2
modules:
  - path: src
lint:
  use:
    - STANDARD

This works as expected, and running buf lint src/foo/bar.proto works correctly. However, I want the 'base' dir to be the workspace root dir, not src [0]. So I modified it as follows:

version: v2
modules:
  - path: .
lint:
  use:
    - STANDARD

Now calling buf lint src/foo/bar.proto hangs forever (5min +). My guess is that buf is trying to scan every folder underneath the root, which contains things like node_modules, python virtual environments, bazel output dir, etc. etc. Ok, so I tried using the includes property:

version: v2
modules:
  - path: .
    includes:
       - src 
lint:
  use:
    - STANDARD

This has two problems:

  1. The yaml configuration file does not recognize the includes property (excludes is available, but not includes, for some reason). This might be a red herring, as it might simply be the editor not recognizing the right format for the yaml file. [1]
  2. The bigger issue is that buf lint src/foo/bar.proto hangs once again. It looks like the includes property is being ignored.

Do you have any ideas what could be happening, or how could I debug further?

Thanks!


Edit

Turning on the --debug option, I see the following:

DEBUG   github.com/bufbuild/buf/private/buf/bufworkspace.(*workspaceProvider).getWorkspaceForBucketBufYAMLV2    {"duration": "7.433µs"}
DEBUG   github.com/bufbuild/buf/private/buf/bufctl.(*controller).getWorkspaceForProtoFileRef    {"duration": "378.889µs"}
DEBUG   building image for target module        {"moduleOpaqueID": ".", "moduleDescription": "path: \".\", includes: \"src\""}

So it looks like the buf cli is trying to build an image, but this is taking a while (5m+)

[0] I am using bazel, and bazel understands imports as starting from the root dir. So if I have to import src/foo/bar2.proto, I need to use src in the path. But buf would not recognize that path if I set path: src in the buf.yaml config.

[1] In fact, the published schema file (https://json.schemastore.org/buf.json) does not have an includes property for the v2 version. This seems like a mistake?

antspy, May 18 '25 17:05

I posted a response earlier, but went through your footnotes and I see that your intended proto root is the root of your source repository. For the includes key, we'll need to update the jsonschema, so I can look into that.

The part about includes being ignored -- just to double check for debugging, what happens if you run plain buf lint, without passing src/foo/bar.proto as the input? Specifying a file as an input in this way has special properties, so I want to first check the behaviour for just buf lint.

doriable, May 20 '25 21:05

Thank you very much for looking into this, it's much appreciated!

To clarify the structure, it would be something like this:

buf.yaml 
node_modules/
bazel-bin/
python_venv/
src
  foo
    bar.proto
  core
    utils 
      type.proto
  ...

Basically .proto files can be anywhere under the src folder. So I want the root to be the workspace / bazel root, and select in includes the src folder only.

To answer your question, I have run buf lint --debug (the buf.yaml was configured with path:. and includes:[src]). The output is:

DEBUG   buffetch termination found      {"curDirPath": "absolute/repo_path", "path": "absolute/repo_path"}
DEBUG   buffetch termination found      {"curDirPath": ".", "path": "."}
DEBUG   targeting workspace based on v2 buf.yaml        {"subDirPath": "."}
DEBUG   github.com/bufbuild/buf/private/buf/bufworkspace.(*workspaceProvider).getWorkspaceForBucketBufYAMLV2    {"duration": "6.102µs"}
DEBUG   github.com/bufbuild/buf/private/buf/bufctl.(*controller).getWorkspaceForSourceRef       {"duration": "411.592µs"}
DEBUG   building image for target module        {"moduleOpaqueID": ".", "moduleDescription": "path: \".\", includes: \"src\""}
DEBUG   github.com/bufbuild/buf/private/bufpkg/bufimage.BuildImage      {"duration": "32.126585614s"}
DEBUG   github.com/bufbuild/buf/private/bufpkg/bufcheck.(*multiClient).ListRulesAndCategories   {"duration": "80.339587ms"}
DEBUG   rulesConfig     {"ruleIDs": ["DIRECTORY_SAME_PACKAGE", "ENUM_FIRST_VALUE_ZERO", "ENUM_NO_ALLOW_ALIAS", "ENUM_PASCAL_CASE", "ENUM_VALUE_PREFIX", "ENUM_VALUE_UPPER_SNAKE_CASE", "ENUM_ZERO_VALUE_SUFFIX", "FIELD_LOWER_SNAKE_CASE", "FIELD_NOT_REQUIRED", "FILE_LOWER_SNAKE_CASE", "IMPORT_NO_PUBLIC", "IMPORT_USED", "MESSAGE_PASCAL_CASE", "ONEOF_LOWER_SNAKE_CASE", "PACKAGE_DEFINED", "PACKAGE_DIRECTORY_MATCH", "PACKAGE_LOWER_SNAKE_CASE", "PACKAGE_NO_IMPORT_CYCLE", "PACKAGE_SAME_CSHARP_NAMESPACE", "PACKAGE_SAME_DIRECTORY", "PACKAGE_SAME_GO_PACKAGE", "PACKAGE_SAME_JAVA_MULTIPLE_FILES", "PACKAGE_SAME_JAVA_PACKAGE", "PACKAGE_SAME_PHP_NAMESPACE", "PACKAGE_SAME_RUBY_PACKAGE", "PACKAGE_SAME_SWIFT_PREFIX", "PROTOVALIDATE", "RPC_PASCAL_CASE", "RPC_REQUEST_RESPONSE_UNIQUE", "RPC_RESPONSE_STANDARD_NAME", "SERVICE_PASCAL_CASE", "SERVICE_SUFFIX", "SYNTAX_SPECIFIED"]}
DEBUG   github.com/bufbuild/buf/private/bufpkg/bufcheck.(*multiClient).Check    {"duration": "116.068µs"}
DEBUG   github.com/bufbuild/buf/private/pkg/thread.Parallelize  {"duration": "72.037793ms", "plugin": ""}
DEBUG   github.com/bufbuild/buf/private/buf/cmd/buf/command/lint.run    {"duration": "153.451894ms"}

The command hangs after the DEBUG building image for target module {"moduleOpaqueID": ".", "moduleDescription": "path: \".\", includes: \"src\""} line. The next line in the output suggests this took 32s, but it's more like 2 min in my experience (timed it using time buf lint --debug).


Finally, there is one more point of interest - the Buf VSCode extension had the following output:

Failure: Module "path: ".", includes: "src"" had no .proto files

It looks like it does not recursively look for proto files inside subdirectories. I am not sure if this is a separate issue of the VSCode extension or it's part of the same issue with the CLI we've been seeing.

antspy, May 21 '25 07:05

Thank you for the response! I'm going to investigate some potential improvements we can make on the build path; the size/repository setup is definitely the contributing factor here.

As for the VSCode extension, normally issues for the extension are filed on https://github.com/bufbuild/vscode-buf/issues but I am also the person looking into those right now -- what version of the extension do you happen to be on? We addressed a concern that I believe is related to this in the latest release, 0.7.2 -- if that version doesn't work for you, feel free to open an issue on the extension with more details and I'll look into it!

doriable, May 21 '25 15:05

Thanks a lot!

Yes, I am on the latest release. I have opened an issue in that repository. Thank you!

antspy, May 21 '25 15:05

Thank you @emcfarlane for putting up a PR for adding the includes key to the JSON schema https://github.com/SchemaStore/schemastore/pull/4753

doriable, Jun 02 '25 15:06

Hi,

Gentle ping :) Is there any progress on this issue?

antspy, Jul 03 '25 09:07

> Gentle ping :) Is there any progress on this issue?

Hello! For transparency, we are currently working through issues one layer at a time, starting with the editor integrations (vscode extension), then lint/breaking performance issues, and then working our way towards the build layer, so we have not prioritized this effort yet.

doriable, Jul 03 '25 22:07

Got it, thank you! Will keep an eye out on this :)

antspy, Jul 04 '25 15:07

Whenever a file is opened/changed/saved, there is a call chain that looks something like file.Refresh() -> IndexImports() -> findImportable() -> GetWorkspace(), and then, for each module in the workspace, module.WalkFileInfos().

WalkFileInfos() implementation calls Walk on the readBucket. readBuckets are composable and wrap each other. The way this works is that the bottom-most readBucket actually implements the FS walk on the module path, then results are filtered up through the wrapping buckets and they then apply their filters (Matchers - for path filtering like includes/excludes, Mappers for path transformation, the stripReadBucket removes path components, etc.).
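The layering described above can be sketched roughly as follows. This is only an illustration of the wrapping pattern, not buf's actual code: the `walker` interface, `baseWalker`, `filterWalker`, `demoFS`, and `walkLayered` are all hypothetical names, and the in-memory tree is made up to resemble the repository layout from this thread.

```go
package main

import (
	"fmt"
	"io/fs"
	"strings"
	"testing/fstest"
)

// walker is a hypothetical stand-in for a read bucket:
// it calls fn once for every file path it yields.
type walker interface {
	Walk(fn func(path string) error) error
}

// baseWalker is the bottom-most layer; it performs the real traversal.
type baseWalker struct {
	fsys    fs.FS
	visited int // entries (files and directories) touched by the traversal
}

func (b *baseWalker) Walk(fn func(path string) error) error {
	return fs.WalkDir(b.fsys, ".", func(p string, d fs.DirEntry, err error) error {
		if err != nil {
			return err
		}
		b.visited++
		if d.IsDir() {
			return nil
		}
		return fn(p)
	})
}

// filterWalker wraps another walker and drops paths that fail keep.
// It only sees results; it cannot stop the inner traversal from descending.
type filterWalker struct {
	inner walker
	keep  func(path string) bool
}

func (f *filterWalker) Walk(fn func(path string) error) error {
	return f.inner.Walk(func(p string) error {
		if !f.keep(p) {
			return nil
		}
		return fn(p)
	})
}

// demoFS is a made-up tree shaped like the repository in this thread.
func demoFS() fstest.MapFS {
	return fstest.MapFS{
		"buf.yaml":                  {},
		"node_modules/a/index.js":   {},
		"node_modules/b/index.js":   {},
		"bazel-bin/out/gen.pb.go":   {},
		"src/foo/bar.proto":         {},
		"src/core/utils/type.proto": {},
	}
}

// walkLayered returns the files that survive the filter and the number
// of entries the base layer had to visit to produce them.
func walkLayered() ([]string, int, error) {
	base := &baseWalker{fsys: demoFS()}
	filtered := &filterWalker{inner: base, keep: func(p string) bool {
		return strings.HasPrefix(p, "src/") && strings.HasSuffix(p, ".proto")
	}}
	var files []string
	err := filtered.Walk(func(p string) error {
		files = append(files, p)
		return nil
	})
	return files, base.visited, err
}

func main() {
	files, visited, err := walkLayered()
	if err != nil {
		panic(err)
	}
	fmt.Println("matched files:", files)
	fmt.Println("entries visited by base layer:", visited)
}
```

In this sketch only two files survive the filter, yet the base layer still visits every entry in the tree, including everything under node_modules and bazel-bin.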

The fundamental issue here is that the bottom-most bucket traverses the entire module path, not taking into account the include list, because that is a filter that exists further up the stack. This needs to be refactored somehow such that if includes are present only they are walked, as per the documentation: includes "Lists directories within this directory to include in Protobuf file discovery. Only directories added to this list are included in Buf operations."

I haven't thought of a clean way to do this yet, but if I come up with one I'll submit a PR.
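For illustration only, one possible direction is to start the traversal at each include directory instead of at the module root. The sketch below assumes a plain `fs.FS` rather than buf's bucket types; `walkIncludes` and `demoFS` are hypothetical names and this is not a proposed patch.

```go
package main

import (
	"fmt"
	"io/fs"
	"path"
	"strings"
	"testing/fstest"
)

// walkIncludes walks only the listed include directories (relative to the
// module root) and returns the .proto files found plus the number of
// entries visited. Directories outside the include list are never opened.
func walkIncludes(fsys fs.FS, includes []string) ([]string, int, error) {
	var files []string
	visited := 0
	for _, inc := range includes {
		root := path.Clean(inc)
		err := fs.WalkDir(fsys, root, func(p string, d fs.DirEntry, err error) error {
			if err != nil {
				return err
			}
			visited++
			if !d.IsDir() && strings.HasSuffix(p, ".proto") {
				files = append(files, p)
			}
			return nil
		})
		if err != nil {
			return nil, visited, err
		}
	}
	return files, visited, nil
}

// demoFS is a made-up tree shaped like the repository in this thread.
func demoFS() fstest.MapFS {
	return fstest.MapFS{
		"buf.yaml":                  {},
		"node_modules/a/index.js":   {},
		"node_modules/b/index.js":   {},
		"bazel-bin/out/gen.pb.go":   {},
		"src/foo/bar.proto":         {},
		"src/core/utils/type.proto": {},
	}
}

func main() {
	files, visited, err := walkIncludes(demoFS(), []string{"src"})
	if err != nil {
		panic(err)
	}
	fmt.Println("proto files:", files)
	fmt.Println("entries visited:", visited)
}
```

Here the traversal cost depends on the size of the include directories rather than on the size of the whole repository. Doing the same inside a stack of composable buckets is the harder part, since the include list would have to be known by the bottom-most layer.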

siler, Sep 17 '25 00:09

Hello, looking into this! We had some similar issues in https://github.com/bufbuild/buf/issues/3219 . A possible workaround for now is to pass the --path flag, for example buf build . --path=src.

emcfarlane, Sep 19 '25 16:09