
Couldn't build proto file into descriptor pool: duplicate file name when using upb python implementation

Open Atheuz opened this issue 2 years ago • 0 comments

What version of protobuf and what language are you using?
Version: 4.21.12
Language: Python

What operating system (Linux, Windows, ...) and version?

What runtime / compiler are you using (e.g., python version or gcc version)?
python: 3.10
buf: 1.11.0
libprotoc: 3.21.12

What did you do? Steps to reproduce the behavior:

  1. Go to https://github.com/Atheuz/test-protobuf-schema-error
  2. Clone the repo
  3. Cd into the repo directory
  4. Create a virtualenv: virtualenv .venv --python=3.10
  5. Activate the virtualenv: source .venv/bin/activate
  6. Install dependencies: pip install -r requirements.txt
  7. Run pytest: pytest
  8. See that when PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION=upb, pytest fails to run.
  9. Change PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION=upb to PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION=python in pytest.ini.
  10. Run pytest: pytest
  11. See that when PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION=python, pytest runs without issue.
  12. Similarly, if you downgrade to protobuf==3.20.3 and run the tests with PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION=cpp or PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION=python, they also succeed. Only PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION=upb, which became available in 4.x, fails.
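
For anyone trying the same comparison outside pytest, here is a minimal sketch of switching backends from Python. The helper names `select_backend` and `active_backend` are hypothetical and not part of the linked repo. The environment variable is the one used in the steps above, and `api_implementation.Type()` is an internal protobuf helper that reports the backend actually loaded.

```python
import os

ENV_VAR = "PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION"
KNOWN_BACKENDS = ("python", "cpp", "upb")


def select_backend(name: str) -> str:
    """Request a protobuf backend (hypothetical helper).

    This only has an effect if it runs before google.protobuf is first
    imported, because the backend is chosen at import time.
    """
    if name not in KNOWN_BACKENDS:
        raise ValueError(f"unknown backend: {name!r}")
    os.environ[ENV_VAR] = name
    return name


def active_backend():
    """Return the backend protobuf actually loaded, or None if protobuf
    is not installed in this environment."""
    try:
        from google.protobuf.internal import api_implementation
    except ImportError:
        return None
    return api_implementation.Type()


if __name__ == "__main__":
    select_backend("upb")
    print("requested:", os.environ[ENV_VAR], "active:", active_backend())
```

Which backends are available depends on the installed protobuf version and build, so the requested and active values may differ.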

What did you expect to see

I expected upb to behave the same as the python implementation, i.e. to work without issue.

What did you see instead?

The basic error that I get is: TypeError: Couldn't build proto file into descriptor pool: duplicate file name (google/protobuf/descriptor.proto)

More detail can be seen in the error.txt file here: https://github.com/Atheuz/test-protobuf-schema-error/blob/master/error.txt

Anything else we should know about your project / environment

What's happening is that we are going into common/terms_pb2.py and manually replacing from google.protobuf import descriptor_pb2 as google_dot_protobuf_dot_descriptor__pb2 with from google_test.protobuf import descriptor_pb2 as google_dot_protobuf_dot_descriptor__pb2.
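
As a rough illustration of that manual edit, the replacement could be scripted like this. It is only a sketch: `patch_descriptor_import` and `patch_file` are hypothetical names, and the two import lines are taken verbatim from the description above.

```python
from pathlib import Path

OLD_IMPORT = (
    "from google.protobuf import descriptor_pb2 "
    "as google_dot_protobuf_dot_descriptor__pb2"
)
NEW_IMPORT = (
    "from google_test.protobuf import descriptor_pb2 "
    "as google_dot_protobuf_dot_descriptor__pb2"
)


def patch_descriptor_import(source: str) -> str:
    """Return the source with the descriptor_pb2 import pointed at google_test.

    Raises ValueError if the expected import line is absent, so a silent
    no-op is not mistaken for a successful patch.
    """
    if OLD_IMPORT not in source:
        raise ValueError("expected descriptor_pb2 import not found")
    return source.replace(OLD_IMPORT, NEW_IMPORT)


def patch_file(path: str) -> None:
    """Apply the replacement in place to a generated *_pb2.py file."""
    target = Path(path)
    patched = patch_descriptor_import(target.read_text(encoding="utf-8"))
    target.write_text(patched, encoding="utf-8")
```

Calling `patch_file("common/terms_pb2.py")` would reproduce the edit described here.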

Note that we do not make this replacement in common/options_pb2.py. The conflict between the two different imports is necessary for the error to appear: if we also make the replacement in common/options_pb2.py, the error disappears and everything works. That is not what we do on our end, though; we make the replacement in only one file, not in all of them.
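
Since the error only shows up when generated files import descriptor_pb2 from different packages, a small check like the following could flag that condition. This is a sketch, not part of the repo: the function names are hypothetical, and the regular expression assumes the `from <package> import descriptor_pb2` form that protoc generates.

```python
import re
from pathlib import Path

# Matches lines such as "from google.protobuf import descriptor_pb2 as ..."
IMPORT_RE = re.compile(
    r"^from\s+([\w.]+)\s+import\s+descriptor_pb2\b", re.MULTILINE
)


def descriptor_import_sources(root):
    """Map each *_pb2.py file under root to the package it imports
    descriptor_pb2 from. Files without such an import are skipped."""
    root_path = Path(root)
    sources = {}
    for path in sorted(root_path.rglob("*_pb2.py")):
        match = IMPORT_RE.search(path.read_text(encoding="utf-8"))
        if match:
            sources[path.relative_to(root_path).as_posix()] = match.group(1)
    return sources


def has_mixed_imports(sources):
    """True when the files disagree about where descriptor_pb2 comes from."""
    return len(set(sources.values())) > 1


if __name__ == "__main__":
    found = descriptor_import_sources(".")
    for name, package in found.items():
        print(f"{name}: {package}")
    print("mixed imports:", has_mixed_imports(found))
```

Run from the repo root, this should report mixed imports for the setup described above, where only common/terms_pb2.py is patched.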

The descriptor_pb2.py file was generated with buf 1.11.0 and protoc 3.21.12 using the following commands:

cd build
buf generate --config=buf.yaml --template=buf.gen.yaml
cp google/protobuf/descriptor_pb2.py ../google_test/protobuf/descriptor_pb2.py

The reason our organization makes this replacement in common/terms_pb2.py is related to the Confluent Kafka Schema Registry: apparently, without the replacement, the schema doesn't match what we have in the registry.

Note that it works fine in protobuf 3.20.3 with both the cpp and python implementations, and in protobuf 4.21.12 with the python implementation. The only implementation that has an issue with this edit is upb.

Atheuz, Jan 09 '23 12:01