flatbuffers [python] FR: Type annotations

The Python code (both generated and not) would be easier to read if it included type annotations. Would a PR adding these be accepted, or are there compatibility concerns with older versions of Python?

Dec 06 '20 16:12 rhofour

That sounds nice to add, yes.

There were discussions elsewhere whether we should at some point drop support for 2.x, but I am not sure if we'd upset anyone with that.

@lu-wang-g may have an opinion. @rw

Dec 07 '20 19:12 aardappel

@rhofour Such a PR would be welcome! I think it would be best if we made the type annotations optional, at first (toggled by a flatc flag?).

Dec 07 '20 20:12 rw

Currently, only the Python Object API has type hint, and it's only available for the attributes in a table. Dropping Python2 will make Type Hint a lot easier.

Dec 07 '20 20:12 lu-wang-g

It's almost been a year since Python 2 was formally deprecated.

If we put this behind a flag though then we don't really need to deprecate Python 2 here, we just say it's incompatible with this flag.

Dec 08 '20 00:12 rhofour

maybe what we could do is default to emitting Python 3 compatible code, have a --python2 flag (that you can use instead of --python or in addition, and omit types if given. That way, if we think of other Python 3 feature we want to start using, we can use the same flag.

Dec 10 '20 21:12 aardappel

I like that idea much more than just adding a type hint specific flag.

However, it still wouldn't let us use Python 3 features in the non-generated Python code so it would still be even nicer to just drop Python 2 support.

Dec 10 '20 22:12 rhofour

We should have a separate discussion about whether to remove Python 2 support. I'll just note that the last version of Python 2 was released in April 2020, so it's possible that there are a lot of people out there still using it who may not be able to upgrade to Python 3. However, maybe we can remove Python 2 support in flatbuffers, because users could just use older flatbuffers Python libraries if they need it. I don't have strong opinions either way, but I'll note that it's fairly easy for us to keep supporting Python 2, so my preference is to not rock the boat unnecessarily.

Dec 10 '20 23:12 rw

Yes, I'd say for now lets go with solutions that make Python 3 default, but Python 2 still possible, until we have stronger reasons to drop it (and we are further ahead in time).

Dec 11 '20 00:12 aardappel

If Flatbuffers is stable from a binary serialization point of view, couldn't Python 2 users use a particular version of flatc and the flatbuffers library and stick with it?

I constantly shoot myself in the foot because mypy has no visibility into what's going on in flatbuffers code (generated or library), heh.

Having the ability to emit type declarations in the generated code (or, for a Py2 point of view, keep them from being emitted) would be amazing.

Dec 14 '20 20:12 maxburke

Yes, older version of FlatBuffers should be able to keep reading newer FlatBuffers serialized data indefinitely.

It's a bit more complicated with schemas, if we add a new feature to the schema parser, older implementations may not be able to compile schemas using that new feature, and thus not generate code to access new fields in a schema. That should not happen a lot, but is possible.

Dec 14 '20 21:12 aardappel

FYI, as a Flatbuffer python API user, TFLite has stopped supporting Python3 to our users, so we are OK with this upgrade. In fact type annotations will be super helpful to Python developers.

Dec 15 '20 00:12 lu-wang-g

This issue is stale because it has been open 6 months with no activity. Please comment or this will be closed in 14 days.

Jun 15 '21 20:06 github-actions[bot]

Commenting to keep issue open. Included type annotations would be great.

Jun 17 '21 17:06 plwalsh

Just my 2 cents, but maybe a python2 and 3 compatible type hint will be fine as of now? e.g.:

# current generated code
def Start(builder): builder.StartObject(2)
def End(builder): return builder.EndObject()


# proposed generated code
def Start(builder):
    # type: (flatbuffers.Builder) -> None
    builder.StartObject(2)
# a bit hacky, but it works and will detect the returned `py_type`
def End(
    builder  # type: flatbuffers.Builder
    ):
    return builder.EndObject()

Some might require typing, which is not a standard module in python2, but since it'll be in a commented block anyway, we could get away with following code. you might prefer checking the python version, but you get the idea.

try:
    import typing
except ImportError:
    pass

Oct 08 '21 07:10 SubaruArai

This issue is stale because it has been open 6 months with no activity. Please comment or this will be closed in 14 days.

Apr 08 '22 20:04 github-actions[bot]

comment

Apr 08 '22 21:04 plwalsh

This issue is stale because it has been open 6 months with no activity. Please comment or label not-stale, or this will be closed in 14 days.

Mar 04 '23 01:03 github-actions[bot]

comment

Mar 04 '23 12:03 plwalsh

First stab at a PR for adding type annotations to flatc-generated code: #7858

Mar 05 '23 23:03 maxburke

@maxburke Haha, I was starting to make a pr myself (this issue has been for 3 years!), until I saw your pr. Thanks!

Mar 05 '23 23:03 SubaruArai

@maxburke Haha, I was starting to make a pr myself (this issue has been for 3 years!), until I saw your pr. Thanks!

I would love it if you were able to give it a try, see if you could find any problems with it too!

Mar 06 '23 00:03 maxburke

Hey everyone, this is excellent. I tried it in my project, and it works as expected, minus some code that doesn't get annotated.

For instance, I have this schema:

table Array {
    shape:[uint32];
    data:[float32];
}

root_type Array;

The generated code using flatc --python --python-typing *.fbs is below. As you can see, even though it worked for some of them, didn't annotate quite a bit of methods' return values.

# automatically generated by the FlatBuffers compiler, do not modify

# namespace: 

import flatbuffers
from flatbuffers.compat import import_numpy
from typing import Any
np = import_numpy()

class Array(object):
    __slots__ = ['_tab']

    @classmethod
    def GetRootAs(cls, buf, offset: int = 0):
        n = flatbuffers.encode.Get(flatbuffers.packer.uoffset, buf, offset)
        x = Array()
        x.Init(buf, n + offset)
        return x

    @classmethod
    def GetRootAsArray(cls, buf, offset=0):
        """This method is deprecated. Please switch to GetRootAs."""
        return cls.GetRootAs(buf, offset)
    # Array
    def Init(self, buf: bytes, pos: int):
        self._tab = flatbuffers.table.Table(buf, pos)

    # Array
    def Shape(self, j: int):
        o = flatbuffers.number_types.UOffsetTFlags.py_type(self._tab.Offset(4))
        if o != 0:
            a = self._tab.Vector(o)
            return self._tab.Get(flatbuffers.number_types.Uint32Flags, a + flatbuffers.number_types.UOffsetTFlags.py_type(j * 4))
        return 0

    # Array
    def ShapeAsNumpy(self):
        o = flatbuffers.number_types.UOffsetTFlags.py_type(self._tab.Offset(4))
        if o != 0:
            return self._tab.GetVectorAsNumpy(flatbuffers.number_types.Uint32Flags, o)
        return 0

    # Array
    def ShapeLength(self) -> int:
        o = flatbuffers.number_types.UOffsetTFlags.py_type(self._tab.Offset(4))
        if o != 0:
            return self._tab.VectorLen(o)
        return 0

    # Array
    def ShapeIsNone(self) -> bool:
        o = flatbuffers.number_types.UOffsetTFlags.py_type(self._tab.Offset(4))
        return o == 0

    # Array
    def Data(self, j: int):
        o = flatbuffers.number_types.UOffsetTFlags.py_type(self._tab.Offset(6))
        if o != 0:
            a = self._tab.Vector(o)
            return self._tab.Get(flatbuffers.number_types.Float32Flags, a + flatbuffers.number_types.UOffsetTFlags.py_type(j * 4))
        return 0

    # Array
    def DataAsNumpy(self):
        o = flatbuffers.number_types.UOffsetTFlags.py_type(self._tab.Offset(6))
        if o != 0:
            return self._tab.GetVectorAsNumpy(flatbuffers.number_types.Float32Flags, o)
        return 0

    # Array
    def DataLength(self) -> int:
        o = flatbuffers.number_types.UOffsetTFlags.py_type(self._tab.Offset(6))
        if o != 0:
            return self._tab.VectorLen(o)
        return 0

    # Array
    def DataIsNone(self) -> bool:
        o = flatbuffers.number_types.UOffsetTFlags.py_type(self._tab.Offset(6))
        return o == 0

def ArrayStart(builder: flatbuffers.Builder):
    builder.StartObject(2)

def Start(builder: flatbuffers.Builder):
    ArrayStart(builder)

def ArrayAddShape(builder: flatbuffers.Builder, shape: int):
    builder.PrependUOffsetTRelativeSlot(0, flatbuffers.number_types.UOffsetTFlags.py_type(shape), 0)

def AddShape(builder: flatbuffers.Builder, shape: int):
    ArrayAddShape(builder, shape)

def ArrayStartShapeVector(builder, numElems: int) -> int:
    return builder.StartVector(4, numElems, 4)

def StartShapeVector(builder, numElems: int) -> int:
    return ArrayStartShapeVector(builder, numElems)

def ArrayAddData(builder: flatbuffers.Builder, data: int):
    builder.PrependUOffsetTRelativeSlot(1, flatbuffers.number_types.UOffsetTFlags.py_type(data), 0)

def AddData(builder: flatbuffers.Builder, data: int):
    ArrayAddData(builder, data)

def ArrayStartDataVector(builder, numElems: int) -> int:
    return builder.StartVector(4, numElems, 4)

def StartDataVector(builder, numElems: int) -> int:
    return ArrayStartDataVector(builder, numElems)

def ArrayEnd(builder: flatbuffers.Builder) -> int:
    return builder.EndObject()

def End(builder: flatbuffers.Builder) -> int:
    return ArrayEnd(builder)

Aug 20 '23 04:08 tasansal

flatbuffers flatbuffers copied to clipboard

[python] FR: Type annotations

flatbuffers
flatbuffers copied to clipboard