flatbuffers
flatbuffers copied to clipboard
[python] FR: Type annotations
The Python code (both generated and not) would be easier to read if it included type annotations. Would a PR adding these be accepted, or are there compatibility concerns with older versions of Python?
That sounds nice to add, yes.
There were discussions elsewhere whether we should at some point drop support for 2.x, but I am not sure if we'd upset anyone with that.
@lu-wang-g may have an opinion. @rw
@rhofour Such a PR would be welcome! I think it would be best if we made the type annotations optional, at first (toggled by a flatc flag?).
Currently, only the Python Object API has type hint, and it's only available for the attributes in a table. Dropping Python2 will make Type Hint a lot easier.
It's almost been a year since Python 2 was formally deprecated.
If we put this behind a flag though then we don't really need to deprecate Python 2 here, we just say it's incompatible with this flag.
maybe what we could do is default to emitting Python 3 compatible code, have a --python2
flag (that you can use instead of --python
or in addition, and omit types if given. That way, if we think of other Python 3 feature we want to start using, we can use the same flag.
I like that idea much more than just adding a type hint specific flag.
However, it still wouldn't let us use Python 3 features in the non-generated Python code so it would still be even nicer to just drop Python 2 support.
We should have a separate discussion about whether to remove Python 2 support. I'll just note that the last version of Python 2 was released in April 2020, so it's possible that there are a lot of people out there still using it who may not be able to upgrade to Python 3. However, maybe we can remove Python 2 support in flatbuffers, because users could just use older flatbuffers Python libraries if they need it. I don't have strong opinions either way, but I'll note that it's fairly easy for us to keep supporting Python 2, so my preference is to not rock the boat unnecessarily.
Yes, I'd say for now lets go with solutions that make Python 3 default, but Python 2 still possible, until we have stronger reasons to drop it (and we are further ahead in time).
If Flatbuffers is stable from a binary serialization point of view, couldn't Python 2 users use a particular version of flatc
and the flatbuffers
library and stick with it?
I constantly shoot myself in the foot because mypy
has no visibility into what's going on in flatbuffers code (generated or library), heh.
Having the ability to emit type declarations in the generated code (or, for a Py2 point of view, keep them from being emitted) would be amazing.
Yes, older version of FlatBuffers should be able to keep reading newer FlatBuffers serialized data indefinitely.
It's a bit more complicated with schemas, if we add a new feature to the schema parser, older implementations may not be able to compile schemas using that new feature, and thus not generate code to access new fields in a schema. That should not happen a lot, but is possible.
FYI, as a Flatbuffer python API user, TFLite has stopped supporting Python3 to our users, so we are OK with this upgrade. In fact type annotations will be super helpful to Python developers.
This issue is stale because it has been open 6 months with no activity. Please comment or this will be closed in 14 days.
Commenting to keep issue open. Included type annotations would be great.
Just my 2 cents, but maybe a python2 and 3 compatible type hint will be fine as of now? e.g.:
# current generated code
def Start(builder): builder.StartObject(2)
def End(builder): return builder.EndObject()
# proposed generated code
def Start(builder):
# type: (flatbuffers.Builder) -> None
builder.StartObject(2)
# a bit hacky, but it works and will detect the returned `py_type`
def End(
builder # type: flatbuffers.Builder
):
return builder.EndObject()
Some might require typing, which is not a standard module in python2, but since it'll be in a commented block anyway, we could get away with following code. you might prefer checking the python version, but you get the idea.
try:
import typing
except ImportError:
pass
This issue is stale because it has been open 6 months with no activity. Please comment or this will be closed in 14 days.
comment
This issue is stale because it has been open 6 months with no activity. Please comment or label not-stale
, or this will be closed in 14 days.
comment
First stab at a PR for adding type annotations to flatc-generated code: #7858
@maxburke Haha, I was starting to make a pr myself (this issue has been for 3 years!), until I saw your pr. Thanks!
@maxburke Haha, I was starting to make a pr myself (this issue has been for 3 years!), until I saw your pr. Thanks!
I would love it if you were able to give it a try, see if you could find any problems with it too!
Hey everyone, this is excellent. I tried it in my project, and it works as expected, minus some code that doesn't get annotated.
For instance, I have this schema:
table Array {
shape:[uint32];
data:[float32];
}
root_type Array;
The generated code using flatc --python --python-typing *.fbs
is below. As you can see, even though it worked for some of them, didn't annotate quite a bit of methods' return values.
# automatically generated by the FlatBuffers compiler, do not modify
# namespace:
import flatbuffers
from flatbuffers.compat import import_numpy
from typing import Any
np = import_numpy()
class Array(object):
__slots__ = ['_tab']
@classmethod
def GetRootAs(cls, buf, offset: int = 0):
n = flatbuffers.encode.Get(flatbuffers.packer.uoffset, buf, offset)
x = Array()
x.Init(buf, n + offset)
return x
@classmethod
def GetRootAsArray(cls, buf, offset=0):
"""This method is deprecated. Please switch to GetRootAs."""
return cls.GetRootAs(buf, offset)
# Array
def Init(self, buf: bytes, pos: int):
self._tab = flatbuffers.table.Table(buf, pos)
# Array
def Shape(self, j: int):
o = flatbuffers.number_types.UOffsetTFlags.py_type(self._tab.Offset(4))
if o != 0:
a = self._tab.Vector(o)
return self._tab.Get(flatbuffers.number_types.Uint32Flags, a + flatbuffers.number_types.UOffsetTFlags.py_type(j * 4))
return 0
# Array
def ShapeAsNumpy(self):
o = flatbuffers.number_types.UOffsetTFlags.py_type(self._tab.Offset(4))
if o != 0:
return self._tab.GetVectorAsNumpy(flatbuffers.number_types.Uint32Flags, o)
return 0
# Array
def ShapeLength(self) -> int:
o = flatbuffers.number_types.UOffsetTFlags.py_type(self._tab.Offset(4))
if o != 0:
return self._tab.VectorLen(o)
return 0
# Array
def ShapeIsNone(self) -> bool:
o = flatbuffers.number_types.UOffsetTFlags.py_type(self._tab.Offset(4))
return o == 0
# Array
def Data(self, j: int):
o = flatbuffers.number_types.UOffsetTFlags.py_type(self._tab.Offset(6))
if o != 0:
a = self._tab.Vector(o)
return self._tab.Get(flatbuffers.number_types.Float32Flags, a + flatbuffers.number_types.UOffsetTFlags.py_type(j * 4))
return 0
# Array
def DataAsNumpy(self):
o = flatbuffers.number_types.UOffsetTFlags.py_type(self._tab.Offset(6))
if o != 0:
return self._tab.GetVectorAsNumpy(flatbuffers.number_types.Float32Flags, o)
return 0
# Array
def DataLength(self) -> int:
o = flatbuffers.number_types.UOffsetTFlags.py_type(self._tab.Offset(6))
if o != 0:
return self._tab.VectorLen(o)
return 0
# Array
def DataIsNone(self) -> bool:
o = flatbuffers.number_types.UOffsetTFlags.py_type(self._tab.Offset(6))
return o == 0
def ArrayStart(builder: flatbuffers.Builder):
builder.StartObject(2)
def Start(builder: flatbuffers.Builder):
ArrayStart(builder)
def ArrayAddShape(builder: flatbuffers.Builder, shape: int):
builder.PrependUOffsetTRelativeSlot(0, flatbuffers.number_types.UOffsetTFlags.py_type(shape), 0)
def AddShape(builder: flatbuffers.Builder, shape: int):
ArrayAddShape(builder, shape)
def ArrayStartShapeVector(builder, numElems: int) -> int:
return builder.StartVector(4, numElems, 4)
def StartShapeVector(builder, numElems: int) -> int:
return ArrayStartShapeVector(builder, numElems)
def ArrayAddData(builder: flatbuffers.Builder, data: int):
builder.PrependUOffsetTRelativeSlot(1, flatbuffers.number_types.UOffsetTFlags.py_type(data), 0)
def AddData(builder: flatbuffers.Builder, data: int):
ArrayAddData(builder, data)
def ArrayStartDataVector(builder, numElems: int) -> int:
return builder.StartVector(4, numElems, 4)
def StartDataVector(builder, numElems: int) -> int:
return ArrayStartDataVector(builder, numElems)
def ArrayEnd(builder: flatbuffers.Builder) -> int:
return builder.EndObject()
def End(builder: flatbuffers.Builder) -> int:
return ArrayEnd(builder)