protobuf
protobuf copied to clipboard
Expose container types in public API
What language does this apply to? Python
Describe the problem you are trying to solve.
We do have a library that automatically decodes some common protobuf types to native Python structures. E.g. ScalarMap
is decoded to dict
. The problem is that with arrival of protobuf==4.*
, protobuf no longer creates instance of google.protobuf.internal.containers.ScalarMap
, but instance of google._upb._message.ScalarMapContainer
instead. I don't particularly like importing this class from something that looks like a private module that may break at any time. Also, I don't really understand why protobuf has two classes that are very similar, but one is not inherited from the other. It's confusing.
Describe the solution you'd like
I'd like you to publish MessageMap
, RepeatedCompositeFieldContainer
, RepeatedScalarFieldContainer
, ScalarMap
as a public API that will only break in major versions. Of course, protobuf should create instances of these public classes so we can test them with isinstance
call.
Describe alternatives you've considered I can conditionally import from private module. This may be quite fragile.
If you do have any other suggestion how I might recognize ScalarMap
(and other types mentioned above) instead of testing them with isinstance
and without the need to import from private module, I'd appreciate that.
All of the internal implementation classes inherit from MutableMapping
, for example: https://github.com/protocolbuffers/protobuf/blob/9e0e7d5f59ecd2501ae638eeb1b65ccf361f008a/python/google/protobuf/internal/containers.py#L366
The public API for testing isinstance()
would be isinstance(container, MutableMapping)
.
Will that work for you?
Hi @haberman, thanks for the reply. I didn't get to this in a while. I'm afraid this works in Python protobuf 3.*
, but not in 4.*
. Example:
# proto/example.proto
message DictMessage
{
map<string, string> values = 1;
}
message ListMessage
{
repeated string values = 1;
}
# test.py
from proto.example_pb2 import DictMessage, ListMessage
d = DictMessage()
print(type(d.values).__mro__)
# outputs: (<class 'google._upb._message.ScalarMapContainer'>, <class 'collections.abc.MutableMapping'>, <class 'collections.abc.Mapping'>, <class 'collections.abc.Collection'>, <class 'collections.abc.Sized'>, <class 'collections.abc.Iterable'>, <class 'collections.abc.Container'>, <class 'object'>)
l = ListMessage()
print(type(l.values).__mro__)
# outputs: (<class 'google._upb._message.RepeatedScalarContainer'>, <class 'object'>)
So, the solution works for ScalarMapContainer
which is instance of MutableMapping
, but not for RepeatedScalarContainer
, which is not instance of MutableSequence
.
We have unit tests that verify that our sequences are instances of MutableSequence
: https://github.com/protocolbuffers/protobuf/blob/b9eb6bf759cf6edf449291215bd4ee2c572845d2/python/google/protobuf/internal/message_test.py#L706-L710
I tried your example and got similar results, however when I tried printing isinstance(l.values, MutableSequence)
it returned true.
So perhaps this is not reflected in __mro__
for some reason, but isinstance()
is behaving as expected.
Ahh, that's interesting. isinstance(l.values, MutableSequence)
returns True
although MutableSequence
is not listed in mro
. Now I'm confused. Why does MutableSequence
not show up in mro
?
I actually do have gRPC decoder which needs to recognize these types and uses mro
to discover decoding function for type that is closest to the type of object submitted to decoder. isinstance
won't tell me anything about the distance. I guess I could make some kind of fallback for this, but it would be kind of ugly.
It looks like we call MutableSequence.register(RepeatedScalarContainer)
instead of deriving from MutableSequence
: https://github.com/protocolbuffers/upb/blob/main/python/repeated.c#L807-L810
This is documented here: https://docs.python.org/3/library/collections.abc.html#module-collections.abc