protobuf
protobuf copied to clipboard
Python: support for optional python_package (or similar) to make import explicit
What language does this apply to?
Python.
Describe the problem you are trying to solve.
The Python code generator uses the directory structure of the .proto files to generate the generated Python code directory layout and imports (e.g. when Protobuf file A imports Protobuf file B). It doesn't sanitize Python reserved keywords.
This can be problematic in two cases:
-
The directory structure contains a reserved keyword. For example when it contains
as. In pythonimport asfails but alsoimport as.foo.baris invalid. -
When in case of A depending on B, A will contain an absolute import that might not be on the Python path (e.g. when the generated code is packaged so instead of
import bit should doimport mypackage.b). This was mentioned as a fix in the discussions: https://github.com/protocolbuffers/protobuf/issues/1491, but I would consider this a workaround.
Describe the solution you'd like
For several languages, there are ways to make the package / module name explicit by defining an option, e.g. option go_package, option java_package, ... (https://developers.google.com/protocol-buffers/docs/proto3#packages). (I believe there is also a ruby_package https://github.com/protocolbuffers/protobuf/pull/4627).
I believe an optional option python_package could solve the above issues as a as/api.proto file then could contain an option python_package = "as_pb.api" (or "my_package.as_pb.api").
The Protobuf code generator will then take the python_package into account when:
- Generating the directory structure for the generated Python code so that it can be imported as
import as_pb.api(as_pb/api.py). - When defining imports (Protobuf file A depends on B). The generated file A then imports B using the (optional) defined
python_package.
Describe alternatives you've considered
My initial approach was to re-define the structure of the original Protobuf files. However, when the Protobuf files are used to generate other languages bindings, this has side-effects (e.g. the Go code now ends up in as_pb/api, it should be in as/api.
I believe the supported languages should not be leading in how the Protobuf structure looks like.
Additional context
Explicit is better than implicit.
(https://www.python.org/dev/peps/pep-0020/)
:wink:
See also https://github.com/brocaar/chirpstack-api/issues/5 .
I this this is a duplicate of #2283 I also really need it. For now i think we have todo rewriting of the imports after the code has been generated.
I face the same issue where the compiled proto files generated have the wrong import statements.
Very old problem and no fix expected
I'm also interested in this.
Hi, Any progress on this issue?
same problem here. 🤦♂️
same problem
We also have the same problem...
Is there some automated viable workaround? It seems like a necessity to conform to language guidelines and restrictions.
Instead of renaming hyphen to underscore automagically, I would vote it be explicit. I'm seeing it break down with other tools, i.e. bazel, and am having to fork and rename packages.
@daicoden at the moment the only "automated" workaround I'm using is some bash vodoo to edit what I need after compiling the protobufs. Something along this:
sed -i 's/^import \(.\+\) as/from . import \1 as/' some/pkg/protobufs/*.py
This code works for me for relative imports in the generated package Create a file python2to3.py and add following code in it. Then run the script from the generated proto project root.
import os
path = os.path.dirname(os.path.dirname(os.path.realpath(__file__)))
path = os.path.join(path, 'src') # package name where generated files are reside
file_path = f'{path}/*_pb2*.py*'
os.system(f'2to3 -wn -f import {file_path}')
Another solution could be to have an option to generate all Python code in a single module, as this would solve the "absolute imports between modules" issue, that I personally solve using the sed trick mentioned above.
Likewise, this is a challenge. Without firm convention and support here, interoperability within our greatly multi-lingual system is strained by competing conventions and workarounds. Editing the file that boldly says "Do Not Edit" asks for trouble.
I'm trying to work with https://github.com/cloudevents/spec/blob/3da5643ebceb39637406a7e30903dbac81cf92d2/cloudevents/formats/cloudevents.proto
package io.cloudevents.v1;
// ...
message CloudEvent {
// ...
}
// ...
When using protoc this puts python code in io.cloudevents.v1.cloudevents_pb2 and assumes this location for imports from other python modules.
This then conflicts with the python builtin io module AFAICT: https://docs.python.org/3/library/io.html
It would be great to have some solution to this.
I setup an example illustrating this problem https://github.com/aucampia/issue-20221116-python_protobuf_cloudevents/tree/da10645893a3085c5bd76642f12aef67a222266a
We triage inactive PRs and issues in order to make it easier to find active work. If this issue should remain active or becomes active again, please add a comment.
This issue is labeled inactive because the last activity was over 90 days ago.
This issue is still relevant and needs a solution.
Another way to approach this is to allow customizing the module prefix. This makes generated code rules more useful since adding a module prefix is a common way to ensure that there's no module conflicts.
The module name is parsed by this helper: https://github.com/protocolbuffers/protobuf/blob/643b0a625f36faed26ccca3e92a57b6b74d7e91b/src/google/protobuf/compiler/python/helpers.cc#L31-L35
I suggested a similar fix on grpclib, see https://github.com/vmagamedov/grpclib/pull/196 .
Update: I made a PR to support a custom module prefix in the Python imports for protobuf but I'm not familiar with the edge cases of such a feature. I also didn't see a lot of tests for the python plugin and import behavior so I can't easily tell if the approach works for usages. It could use a pair of eyes:
https://github.com/protocolbuffers/protobuf/pull/17286
We triage inactive PRs and issues in order to make it easier to find active work. If this issue should remain active or becomes active again, please add a comment.
This issue is labeled inactive because the last activity was over 90 days ago. This issue will be closed and archived after 14 additional days without activity.
@github-actions boop
any news?
I'm way late to the party it seems but I too find my self encountering this issue this morning and am quite surprised that it's still an unresolved issue (seems to me like https://github.com/protocolbuffers/protobuf/issues/2283 wasn't actually resolved, despite being closed).
While I'm completely ignorant of the consequences and level of difficulty involved in supporting an analogous developer experience to what you get in go (and apparently other languages; e.g., java), it would be a huge improvement to have consistency here.
+1
We triage inactive PRs and issues in order to make it easier to find active work. If this issue should remain active or becomes active again, please add a comment.
This issue is labeled inactive because the last activity was over 90 days ago. This issue will be closed and archived after 14 additional days without activity.
#KeepAlive
Just to chip in, we've been using 2to3 and now regex (similar to https://github.com/protocolbuffers/protobuf/issues/7061#issuecomment-857511673) for... 5-6 years? Very strange to see such a large amount of voices asking for something and just... nothing.
Here's a standalone bash script:
Usage:
$ ./python-add-prefix.sh <python_generated_code_dir> <prefix>
#!/bin/bash
set -e
PROTO_PATH=$1
PREFIX=$2
if [ -z "$PROTO_PATH" ]; then
echo "Usage: $0 <path> <package-prefix>";
exit 1;
fi
if [ -z "$PREFIX" ]; then
echo "Usage: $0 <path> <package-prefix>";
exit 1;
fi
TARGET="$PROTO_PATH/$PREFIX"
if [ ! -d "$TARGET" ]; then
mv "$PROTO_PATH" "${PROTO_PATH}-tmp"
mkdir -p "$PROTO_PATH"
mv "${PROTO_PATH}-tmp" "$TARGET"
fi
cd "$TARGET"
for dir in $(find . -type d); do
pkgName="$( echo $dir | perl -pE 's/^\.\///g' | perl -pE 's/\//\./g' )"
echo "$pkgName -> $PREFIX.$pkgName"
pkgRegex="$( echo $pkgName | perl -pE 's/\./\\\./g' )"
find $dir -mindepth 1 -maxdepth 1 -type f -name "*.py*" | xargs perl -pi -E "s/^from ($pkgName)/from $PREFIX.\$1/g"
find $dir -maxdepth 1 -maxdepth 1 -type f -name "*.py*" | xargs perl -pi -E "s/^import ($pkgName)/import $PREFIX.\$1/g"
find $dir -maxdepth 1 -maxdepth 1 -type f -name "*.pyi" | xargs perl -pi -E "s/\[($pkgName)\./\[$PREFIX.\$1\./g"
done
Some notes:
- Using
perlinstead ofsedfor MacOS compatibility - Using
*.py*pattern to include*.pyifiles (mypy stubs)
Just ran into this
Particularly an issue with envoyproxy protocol buffers, which are really complex
Other compilers such as betterproto seem to have no issue in setting up the imports correctly
Just ran into this for a fairly simple example that's just two proto files, I don't think this is that much of an edge case.