Non-deterministic file_..._proto_rawDesc byte array generation for messages using custom options (e.g., gen_bq_schema.bq_table)
What version of protobuf and what language are you using?
Go protobuf runtime: google.golang.org/protobuf v1.36.6
protoc-gen-go plugin: buf.build/protocolbuffers/go:v1.36.5
protoc-gen-bq-schema plugin: github.com/GoogleCloudPlatform/protoc-gen-bq-schema/v3 v3.1.0
buf CLI version: 1.55.1
OS: Mac darwin arm64
Go version: go1.24.4 darwin/arm64
What did you do?
We are using buf generate to generate Go protobuf code (.pb.go files) from our .proto definitions. Our setup includes a custom message option, (gen_bq_schema.bq_table), provided by the buf.build/googlecloudplatform/bq-schema plugin. This option is applied to multiple messages in our schemas.
We execute buf generate multiple times on an identical set of input .proto files within a fully isolated environment (using a temporary, unique BUF_HOME directory for each run).
Here is an example:
buf.yaml:
version: v1
buf.gen.yaml:
version: v2
plugins:
  - remote: buf.build/protocolbuffers/go:v1.36.5 # Plugin version
    out: .
    opt: paths=source_relative
  - remote: buf.build/googlecloudplatform/bq-schema:v3.1.0 # Plugin version
    out: .
./test/v0/test.proto:
syntax = "proto3";
package test.v0;
import "gen_bq_schema/bq_table.proto";
import "google/protobuf/timestamp.proto";
import "google/protobuf/struct.proto";
message TestMessage {
option (gen_bq_schema.bq_table) = {
table_name: "golden__order_group__contract_in_waiting"
};
// Add some common fields that might be present in affected messages
string id = 1;
google.protobuf.Timestamp event_time = 2;
google.protobuf.Struct details = 3;
optional string optional_field = 4; // Example of an optional field
}
Then run the Go code generation tool repeatedly and you'll see different results. Try something like:
# Create isolated temporary directories for each run
mkdir -p .tmp_out/run1 .tmp_out/run2
mkdir -p .tmp_buf_home_1 .tmp_buf_home_2
# Run buf generate for the first time
BUF_HOME="$(pwd)/.tmp_buf_home_1" buf generate --config buf.yaml --template buf.gen.yaml test --output .tmp_out/run1/
# Run buf generate for the second time
BUF_HOME="$(pwd)/.tmp_buf_home_2" buf generate --config buf.yaml --template buf.gen.yaml test --output .tmp_out/run2/
# Compare the generated files
diff -u .tmp_out/run1/test/v0/test.pb.go .tmp_out/run2/test/v0/test.pb.go
When you diff the output of those runs, you'll see that the byte string for the descriptor changes slightly.
What did you expect to see?
I expected bit-for-bit identical output across runs.
What did you see instead?
The generated Go file test.pb.go is non-deterministic between runs. The only differences observed are consistently within the file_test_v0_test_proto_rawDesc byte array.
Specifically, a snippet from two different runs shows a difference in the ordering of two sub-fields within an embedded message that represents a Protobuf extension:
(The following is from Gemini... I had no easy way to validate since I wasn't going to check hex values.)
File 1 Snippet (from test.pb.go run1):
var file_test_v0_test_proto_rawDesc = string([]byte{
// ... (preceding identical bytes) ...
0xa2, 0x51, 0x04, 0x10, 0x01, 0x08, 0x01, 0x62, 0x06, 0x70, 0x72, 0x6f, 0x74, 0x6f, 0x33,
})
File 2 Snippet (from test.pb.go run2, showing the difference):
var file_test_v0_test_proto_rawDesc = string([]byte{
// ... (preceding identical bytes) ...
0xa2, 0x51, 0x04, 0x08, 0x01, 0x10, 0x01, 0x62, 0x06, 0x70, 0x72, 0x6f, 0x74, 0x6f, 0x33,
})
Analysis of the difference:
The 0xa2, 0x51 tag decodes to a protobuf field number 1300 with wire type 2 (length-delimited). This is characteristic of an extension field to a standard protobuf message (in this case, google.protobuf.MessageOptions).
The 0x04 indicates that the following 4 bytes are the content of this length-delimited field.
The difference lies in the order of 0x10, 0x01 and 0x08, 0x01.
0x10, 0x01 decodes to: Protobuf Field 2 (varint type), with value 1.
0x08, 0x01 decodes to: Protobuf Field 1 (varint type), with value 1.
Therefore, the issue is that two inner fields (Field 1 and Field 2, both with value 1) of the gen_bq_schema.bq_table extension are being serialized in a different order by protoc-gen-go between runs.
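The tag arithmetic above can be checked mechanically. Here is a minimal sketch using only the Go standard library (decodeTag is a helper name made up for this illustration, not part of any protobuf API). It relies on protobuf tags being plain base-128 varints, which encoding/binary can read:

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// decodeTag splits a protobuf wire-format tag (a varint) into its field
// number and wire type. The third result is the number of bytes consumed,
// or 0 if the input does not start with a valid varint.
func decodeTag(b []byte) (uint64, uint64, int) {
	tag, n := binary.Uvarint(b)
	if n <= 0 {
		return 0, 0, 0
	}
	// The low 3 bits carry the wire type, the remaining bits the field number.
	return tag >> 3, tag & 0x7, n
}

func main() {
	// Outer tag from the snippets above.
	num, wt, _ := decodeTag([]byte{0xa2, 0x51})
	fmt.Printf("field=%d wireType=%d\n", num, wt) // field=1300 wireType=2

	// Inner tags whose order differs between the two runs.
	for _, b := range [][]byte{{0x10}, {0x08}} {
		num, wt, _ := decodeTag(b)
		fmt.Printf("field=%d wireType=%d\n", num, wt)
	}
}
```

The last two lines printed are field=2 wireType=0 and field=1 wireType=0, matching the manual decoding above.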
Anything else we should know about your project / environment?
This non-determinism only occurs on a subset of our messages that utilize the (gen_bq_schema.bq_table) extension. Many other messages using the same (gen_bq_schema.bq_table) option, generated in the same overall buf generate command, produce fully deterministic rawDesc byte arrays.
This suggests that the non-determinism might be triggered by a specific interaction with other fields, options (e.g., specific table_name string content), or the overall complexity of the .proto message definition that these particular messages possess.
We ensure the .proto input files are byte-for-byte identical for each buf generate run by using source control and checksums.
The BUF_HOME environment variable is set to a temporary, unique directory for each buf generate run to ensure that no cached artifacts or previous build states affect the new generation.
Hey @lazarillo! It’s certainly unexpected to encounter non-determinism in these descriptors 🤔
What are the minimal steps to reproduce this issue without Buf, just with Go Protobuf? (I’m not familiar with Buf’s code.)
Thanks!
I figured it out a bit further, and some of what I wrote above is wrong. Here are the updates:
- It is not coming from the gen-bq-schema plugin that I thought. It is instead coming from a plugin that our team created, but which we provide publicly here
  - That package is unrelated to Go: it's actually to adjust how protoc generates Python code, because it does not create modules that can easily act as a package, and we typically see and share our collection of proto messages as a package.
  - I think the issue might be that the package does not have a go.mod + go.sum + generated Go code. We were trying to make that package generic, and it seemed weird that we would need to explicitly do something for Go and not for other languages. But, it seems that we do, and we'll update the package to include those elements.
- But it still seems like the issue lies within protoc-gen-go, because if what I wrote above is correct, then this means that code for the google.protobuf.FieldOptions (and/or google.protobuf.MessageOptions) is not being generated consistently within Go code generation. I understand that if we keep pointing to the same compiled code, then it isn't / shouldn't be an issue, but it still means that Go code generation is not deterministic.
  - Here is some feedback I had from Gemini when trying to debug. In it, I share the bytes that got swapped, so that you can see that.
As for a reproducible example without Buf, basically swap out buf generate for protoc. First, let me fix the toy example, since the plugin involved is not gen-bq-schema but protoc-gen-py-pkg:
syntax = "proto3";
package test.v0;
import "py_package/.proto";
import "google/protobuf/timestamp.proto";
import "google/protobuf/struct.proto";
option (gen_py_pkg.py_package_opts).enable = true;
option (gen_py_pkg.py_package_opts).enable_top_level_imports = true;
message TestMessage {
// Add some common fields that might be present in affected messages
string id = 1;
google.protobuf.Timestamp event_time = 2;
google.protobuf.Struct details = 3;
optional string optional_field = 4; // Example of an optional field
}
Then try to compile the Go code with the standard protoc --go_out . test/v0/test.proto like you normally would. (I'm not sure if the flag is --go_out... I only use Buf. But I'm sure you know it.)
I took the byte slices from your linked Google Doc and put them into two separate Go files, one.go and two.go, with the following header:
package main
import "os"
func main() {
os.Stdout.Write(file_datacontractorder_v0_order_group_read_proto_rawDesc)
}
var file_datacontractorder_v0_order_group_read_proto_rawDesc = []byte{
0x0a, 0x2b, 0x64, 0x61, 0x74, 0x61, 0x63, 0x6f, 0x6e, 0x74, 0x72, 0x61, 0x63, 0x74, 0x6f, 0x72,
[…]
Then, I dumped the raw protobuf wire format to a file:
% go run one.go > one.bin
% go run two.go > two.bin
Using the https://github.com/protocolbuffers/protoscope tool, we can see how the wire format encoding differs:
% diff -u <(protoscope one.bin) <(protoscope two.bin)
--- /proc/self/fd/11 2025-07-24 11:29:27.903454322 +0200
+++ /proc/self/fd/15 2025-07-24 11:29:27.903454322 +0200
@@ -108,8 +108,8 @@
44: {"Datacontractorder\\V0\\GPBMetadata"}
45: {"Datacontractorder::V0"}
1300: {
- 2: 1
1: 1
+ 2: 1
}
}
12: {"proto3"}
OK, so what is field 1300? The descriptor message is of type google.protobuf.FileDescriptorProto, so let’s decode it:
% protoc --decode=google.protobuf.FileDescriptorProto ~/Downloads/protoc-31.1/include/google/protobuf/descriptor.proto < one.bin
[…]
options {
java_package: "com.datacontractorder.v0"
java_outer_classname: "OrderGroupReadProto"
java_multiple_files: true
go_package: "github.com/cellpointdigital/phalanx-insights/libraries/data-contracts/order/pkg/gen/datacontractorder/v0;datacontractorder"
objc_class_prefix: "DVX"
csharp_namespace: "Datacontractorder.V0"
php_namespace: "Datacontractorder\\V0"
php_metadata_namespace: "Datacontractorder\\V0\\GPBMetadata"
ruby_package: "Datacontractorder::V0"
1300 {
2: 1
1: 1
}
}
syntax: "proto3"
Aha! So at this point, my hypothesis would be that custom file options are not encoded deterministically.
But, looking at the Go Protobuf code, we do specify deterministic encoding (and have done so for the last 6 years):
https://github.com/protocolbuffers/protobuf-go/blob/8e8926ef675d99b1c9612f5d008f4dc803839f7a/cmd/protoc-gen-go/internal_gengo/reflect.go#L240
In order to further track down this source of non-determinism, I tried reproducing as follows:
repro.proto contains:
syntax = "proto3";
package test.v0;
option go_package = "github.com/golang/protobuf/issues/1692";
import "py_package.proto";
import "google/protobuf/timestamp.proto";
import "google/protobuf/struct.proto";
option (py_package.py_package_opts).enable = true;
option (py_package.py_package_opts).enable_top_level_imports = true;
message TestMessage {
// Add some common fields that might be present in affected messages
string id = 1;
google.protobuf.Timestamp event_time = 2;
google.protobuf.Struct details = 3;
optional string optional_field = 4; // Example of an optional field
}
py_package.proto contains:
syntax = "proto3";
package py_package;
import "google/protobuf/descriptor.proto";
option go_package = "github.com/stephenoken/protoc-gen-py-pkg/protos";
extend google.protobuf.FileOptions {
// Python package options.
PyPackageOptions py_package_opts = 1300;
}
message PyPackageOptions {
// Enable Python package generation.
bool enable = 1;
// If true, imports will be generated at the top level of the package.
bool enable_top_level_imports = 2;
}
Running protoc --go_out=. ./repro.proto works for me and produces ./github.com/golang/protobuf/issues/1692/repro.pb.go.
I ran this command in a loop 10_000 times, comparing the newly generated file with the previous version:
% for i in $(seq 10000); do mv ./github.com/golang/protobuf/issues/1692/repro.pb.go /tmp/old.pb.go; protoc --go_out=. ./repro.proto; diff -u /tmp/old.pb.go ./github.com/golang/protobuf/issues/1692/repro.pb.go; done
%
As you can see, there are no printed diffs, so on my machine, this command is deterministic.
I wonder how your environment results in non-deterministic output. Can you tell me how to make my reproducer trigger the issue? Can you run this reproducer in your environment to see if it’s also affected, or if there are other triggering conditions that we are missing? What rate of non-determinism are you seeing? (Like, 1 in 100, or 1 in 10_000, or…)
I also could not repro through protoc alone… I’m starting to think maybe Buf doesn’t call into protoc to make the actual call to protoc-gen-go but is parsing the proto itself?
https://github.com/bufbuild/buf/blob/d3d9b8647b201c740a0cce5cffd69e7d03d82cee/private/buf/cmd/buf/command/alpha/protoc/protoc.go
This command replaces protoc using Buf's internal compiler.
The implementation is in progress. Although it outperforms mainline protoc, it hasn't yet been optimized.
This looks likely. If we’re getting a non-deterministic descriptor from buf, then we cannot help but be non-deterministic ourselves?
I was working on getting Buf up so I could do a repro through the first directions, but I have family visiting, and cannot poke at it right now.