python-betterproto icon indicating copy to clipboard operation
python-betterproto copied to clipboard

Naming collision with nested enums

Open qria opened this issue 4 years ago • 9 comments

Problem

Currently these two enums in the same file will be both compiled into the same name of ContentStatus

message Content {enum Status{ ... }}
enum ContentStatus{ ... }

Currently the behaviour is to silently overwrite one of the enums, therefore making all subsequent call of the enum possibly wrong.

Workaround

I have consulted @nat-n and they say they don't see a way to work around the issue currently, except to directly modify the generated code.

qria avatar Feb 16 '21 19:02 qria

The ideal solution for this would be a (breaking) change for nested enums to instead compile to nested classes in python instead of relying on naming concatenation.

nat-n avatar Feb 22 '21 19:02 nat-n

What is the policy on breaking changes of this project?

qria avatar Feb 23 '21 00:02 qria

I am looking into this, I note it's still a problem and was able to generate an example of it.

cetanu avatar Oct 14 '23 04:10 cetanu

I might need a tl;dr on how the enum ends up getting named and which python module in the generated tree it ends up in.

If I look at how this case is handled in other libraries/languages, I see that they put the nested enum into a module with the name of its parent

So in the tree there would be:

  • __init__.py with the message Content and ContentStatus
  • content.py with the enum Status

I am currently reading through betterproto/plugin/models.py and betterproto/plugin/parser.py to try and understand how the decision is made to put the enum Status in the same file as the message Content. Cannot say I have reached comprehension as yet.

cetanu avatar Oct 15 '23 07:10 cetanu

I've reached some understanding after looking at the OutputTemplate that is created for this nested enum example.
The avenue of creating another file, while not impossible (and IMO the preferable way to do this), will probably take too much work. The suggestion by @nat-n to make a nested class is probably the way to go, and is how I am going to approach this. Will make an attempt soon.

cetanu avatar Oct 15 '23 12:10 cetanu

This is much harder than I thought...

I've managed to turn this protobuf

syntax = "proto3";

package nested_enum;

message Test {
    enum Inner {
        NONE = 0;
        THIS = 1;
    }
    Inner status = 1;

    message Doubly {
        enum Inner {
            NONE = 0;
            THIS = 1;
        }
        Inner status = 1;
    }
}


message TestInner {
  int32 foo = 1;
}

message TestDoublyInner {
  int32 foo = 1;
  string bar = 2;
}

enum Outer {
    foo = 0;
    bar = 1;
}

message Content {
    message Status {
        string code = 1;
    }
    Status status = 1;
}

message ContentStatus {
    int32 id = 1;
}

into this code

@dataclass(eq=False, repr=False)
class Test(betterproto.Message):
    @dataclass(eq=False, repr=False)
    class TestDoubly(betterproto.Message):
        # It would be nice if I could strip the prefix from these class names
        class TestDoublyInner(betterproto.Enum):
            NONE = 0
            THIS = 1

        # this type hint is wrong; it is referencing the global TestDoublyInner
        status: "TestDoublyInner" = betterproto.enum_field(1)

    class TestInner(betterproto.Enum):
        NONE = 0
        THIS = 1

    status: "TestInner" = betterproto.enum_field(1)


@dataclass(eq=False, repr=False)
class TestInner(betterproto.Message):
    foo: int = betterproto.int32_field(1)


@dataclass(eq=False, repr=False)
class TestDoublyInner(betterproto.Message):
    foo: int = betterproto.int32_field(1)
    bar: str = betterproto.string_field(2)


@dataclass(eq=False, repr=False)
class Content(betterproto.Message):
    @dataclass(eq=False, repr=False)
    class ContentStatus(betterproto.Message):
        code: str = betterproto.string_field(1)

    status: "ContentStatus" = betterproto.message_field(1)


@dataclass(eq=False, repr=False)
class ContentStatus(betterproto.Message):
    id: int = betterproto.int32_field(1)


class Outer(betterproto.Enum):
    foo = 0
    bar = 1

See the inline comments for two of the problems I'm having so far.

cetanu avatar Oct 16 '23 02:10 cetanu

I've parked my progress in a branch. If someone else could give it a glance that would be good, maybe I'm off-track or approaching this the wrong way.

https://github.com/danielgtaylor/python-betterproto/commit/b9ec59f2a3e95afe5163197096adb91ade7ec36b

cetanu avatar Oct 16 '23 02:10 cetanu

Hey @cetanu, nice work, you're close.

In my view, the ideal approach is for the Python module/class structure to mirror that of the proto config. Unless I'm missing something following this principle should have the most elegant and least surprising or problematic result.

So just to be clear I think what we want is for the Test class to look like so:

@dataclass(eq=False, repr=False)
class Test(betterproto.Message):
    @dataclass(eq=False, repr=False)
    class Doubly(betterproto.Message):
        class Inner(betterproto.Enum):
            NONE = 0
            THIS = 1

        status: "Test.Doubly.Inner" = betterproto.enum_field(1)  # Note the use of fully qualified class name here

    class Inner(betterproto.Enum):
        NONE = 0
        THIS = 1

    status: "Test.Inner" = betterproto.enum_field(1)

I guess you agree?

So the problem is how to avoid using concatenated class names? Do you know where the concatenation is taking place? Could it be refactored to represent class identification with an object that retains the structure so it can be formatted in a context appropriate way? (e.g. a tuple subclass, sometimes so you can use take self[0], sometimes you take ".".join(self) ).

I'm afraid making progress in this codebase requires a willingness to refactor :P

I don't have time/motivation to really dig through the code right now, but I'm happy to offer this kind of high level collaboration.

nat-n avatar Oct 16 '23 09:10 nat-n

I agree completely, I just haven't quite worked out:

  1. where the names are concatenated
  2. where the type hints are also formed However what you've put as an example is the ultimate goal.

cetanu avatar Oct 16 '23 10:10 cetanu