flatbuffers icon indicating copy to clipboard operation
flatbuffers copied to clipboard

[Python] Render enums as Python IntEnum

Open fliiiix opened this issue 1 year ago • 9 comments

This allows enums to be type check with mypy. They will still behave like ints ->

IntEnum is the same as Enum, but its members are also integers and can be used anywhere that an integer can be used. If any integer operation is performed with an IntEnum member, the resulting value loses its enumeration status. https://docs.python.org/3/library/enum.html#enum.IntEnum

I think that should be fair enough since an enum is always a int:

Only integer types are allowed, i.e. byte, ubyte, short ushort, int, uint, long and ulong.

fliiiix avatar Nov 04 '23 09:11 fliiiix

I got a bit lost on how ImportMapEntry works but as far as i can see its not required to be passed along, but if required i can refactor to pass along the import statement.

fliiiix avatar Nov 04 '23 10:11 fliiiix

To avoid compatibility issues this should probably only be enabled when --python-typing is set on the command line.

akb825 avatar Nov 07 '23 07:11 akb825

I thought about that, but this while helping with type checking is in my opinion the correct way to generate an enum in python, the cpp code does also generate an enum type

enum Color {
  Color_Red = 0,
  Color_Green = 1,
  Color_Blue = 2,
  Color_MIN = Color_Red,
  Color_MAX = Color_Blue
};

and i could not come up with anyway this would break existing code

But if there good arguments against doing this by default im open to move this also behind the --python-typing flag.

fliiiix avatar Nov 07 '23 11:11 fliiiix

and i could not come up with anyway this would break existing code

Flatbuffers currently supports Python 2 (at least it's still advertised in setup.py), while Python enums were added in 3.4. You could question whether Python 2 support should be dropped given that it's EOL, which would be a question for the maintainers, but I'd imagine that breaking changes shouldn't be added until that officially happens.

the cpp code does also generate an enum type

It's worth noting that the default enums for C++ are the old C-style enums. With the --scoped-enum command-line option it creates the newer C++11 enum classes. In a lot of ways that's similar to the existing Python enum as constants in an object compared to using IntEnum.

akb825 avatar Nov 08 '23 03:11 akb825

@akb825 Fair enough, i missed the Python 2 support its behind the flag for now.

fliiiix avatar Nov 10 '23 08:11 fliiiix

Thanks! Don't forget to also update the generated code in the tests directory.

Another thought is that if you want to get the full benefit of type checking, you will probably also want to adjust the type annotations in the generated code (when --python-typing is set) for functions that take and return enum values to use the actual specific enum types rather than int. I'm not sure how difficult that would be given the current internal model to know the real type and manage the imports, and it may require some adjustments to the serialization code to assign the enum values appropriately.

Also, full disclosure: I'm not a maintainer and won't be able to approve this PR. The maintainers haven't been very active lately, so it may take some time to get it approved. (I myself noticed this PR when checking to see if there's been any activity lately since I'm waiting on getting a bugfix approved from late May...)

akb825 avatar Nov 11 '23 01:11 akb825

@dbaileychess what is missing to get this merged?

fliiiix avatar Jan 20 '24 09:01 fliiiix

Rebased to latest master, anything else required to get this merged? @dbaileychess

fliiiix avatar Feb 08 '24 15:02 fliiiix

@dbaileychess Would be wonderful to see this go in

colinRawlings avatar Mar 06 '24 17:03 colinRawlings

Updated and rebased on @anton-bobukh changes - any chance this can be merged?

fliiiix avatar Jun 01 '24 15:06 fliiiix

I've recently separated the generation of .pyi and .py files, and for this, you would need to update the stub generator: https://github.com/google/flatbuffers/blob/master/src/idl_gen_python.cpp#L591-L607.

Also, as pointed out above, enum is not a thing in Python2, and we should probably only use enum.IntEnum when the --python-version flag is set to Python3 only.

From what I know about enum.IntEnum, unlike C++ enums, it does not allow arbitrary values which will affect the backward-forward compatibility of the schema/generated code. For example, if the client adds a new enum value to the schema and sends it over to the server before the server is updated. In this case, we still want the server code to be able to handle the unknown value, not raise a ValueError. One way to solve this would be to only update the stubs and keep the .py files unchanged, let the generated functions to still use ints.

anton-bobukh avatar Jun 01 '24 17:06 anton-bobukh

for this, you would need to update the stub generator

good point i kinda assumed that should not be affected but that sounds wrong and i will look into it

and we should probably only use enum.IntEnum when the --python-version flag is set to Python3 only

It's behind the --python-typing. I don't understand the difference between this flag and the new --python-version flag https://github.com/google/flatbuffers/pull/8145/files#diff-7c5e528085818515d85558785f93cc109a78c0c683d6fc2555fad55c09b5ee3eR679

At least my understanding is none of the typing stuff will work in Python 2. For example combining --python-typing and selecting python2 will always give you invalid python2 code.

In this case, we still want the server code to be able to handle the unknown value, not raise a ValueError.

Can you explain to me how that currently works? As far as I can tell this problem exists in exactly the same way with the old code there is no way to add a value which does not exist

from enum import IntEnum

class Foo():
    Value1 = 1

class Bar(IntEnum):
    Value2 = 2

foo = Foo(3)
bar = Bar(3)

One way to solve this would be to only update the stubs and keep the .py files unchanged, let the generated functions to still use ints.

I will look into that :+1:

fliiiix avatar Jun 01 '24 19:06 fliiiix

My plan was to remove any type annotation from the generated .py file and fully rely on .pyi file for type checking, so --python-version controls the generation of .py files (e.g. doing class Fo(object): vs class Foo:), while -python-typing controls the generation of .pyi files.

I just realized that you don't change the way enums are read. Currently, we read them as integers (e.g. https://github.com/google/flatbuffers/blob/6ede1ccc9e24e00d5b19c19d6df0f09fdf1a64fe/tests/MyGame/Example/NestedStruct.py#L43), and backward-forward compatibility will only be an issue if we wrap those integers into the corresponding enum class. Doing so will raise a ValueError for unknown values (e.g. when the a new enum value is added and the client is updated to send that new value before the server is updated to handle it).

anton-bobukh avatar Jun 01 '24 19:06 anton-bobukh

I just realized that you don't change the way enums are read.

I switched it now to put all this info in the .pyi files so i touch the enums even less. This seems to be a relative simple and clean solution and in line with your plan, what do you think about this proposal?

fliiiix avatar Jun 02 '24 10:06 fliiiix

Hi there, please, can anyone explain to me, what is the point of annotating a type as (Int)Enum while the real type of the data remains "object"? Isn't it just a deliberate lie? The output type has remained "object" for looong time because of compatibility with Python 2. Since this commit, the new --python-version flag allows to forget about Python 2 for good.... if you just use this flag for the code generator as well (not just for the naming as the linked commit does). I would like to tell the generator "yes, I do not bother about EOL products, hit me with the features of the Python3". The --python-version seemed to be a good candidate for this... until I discovered that it is currently used just to escape keywords.

EDIT: Type of the enum is actually "object".

bagring avatar Aug 08 '24 13:08 bagring

Fair point with --python-version set to 3 and higher i guess we could render it as IntEnum instead of object. Putting this info in the typeshed for the type checker was good enough for my use case.

fliiiix avatar Aug 09 '24 13:08 fliiiix