protobuf icon indicating copy to clipboard operation
protobuf copied to clipboard

Protoc generates invalid Java code for nested types with same name

Open donce opened this issue 6 months ago • 5 comments

What version of protobuf and what language are you using? Version: Latest - v31.1 release - protoc --version outputs 3.21.10 Language: Java

What operating system (Linux, Windows, ...) and version? macOS 15.5

What runtime / compiler are you using (e.g., python version or gcc version) Java 21 (latest LTS)

What did you do? Steps to reproduce the behavior:

  1. Create a proto file with nested types using same name, e.g. message.proto:
syntax = "proto3";

package com.example;

message Something {
	enum Something {
		A = 0;
		B = 1;
	}
}
  1. Generate Java code:
protoc message.proto --java_out=.

This produces Java code, which contains nested types with same name (Something inside Something), which is not valid in Java:


package com.example;

public final class Message {
  ...
  public static final class Something extends
      com.google.protobuf.GeneratedMessageV3 implements
      // @@protoc_insertion_point(message_implements:com.example.Something)
      SomethingOrBuilder {
  
    ...
    public static final class Something extends
        com.google.protobuf.GeneratedMessageV3 implements
        // @@protoc_insertion_point(message_implements:com.example.Something.Something)
        SomethingOrBuilder {
        ....
        }
  ...
}
  1. Download protobuf-java: Maven

  2. Compile generated Java code:

javac -cp protobuf-java-4.31.1.jar com/example/Message.java

What did you expect to see

Compilation to succeed.

What did you see instead?

Compilation error (first error is the root cause):

com/example/Message.java:80: error: class Something is already defined in class Message
    public static final class Something extends
                        ^
com/example/Message.java:746: error: incompatible types: com.google.protobuf.AbstractMessage.Builder cannot be converted to com.google.protobuf.GeneratedMessageV3.Builder
      return DEFAULT_INSTANCE.toBuilder().mergeFrom(prototype);
                                                   ^
com/example/Message.java:941: error: incompatible types: com.google.protobuf.GeneratedMessageV3.Builder cannot be converted to com.example.Message.Something.Builder
        Builder builder = newBuilder();
                                    ^
com/example/Message.java:952: error: incompatible types: com.example.Message.Something cannot be converted to com.example.Message.Something.Something
        return builder.buildPartial();
                                   ^
com/example/Message.java:935: error: incompatible types: <anonymous AbstractParser<com.example.Message.Something.Something>> cannot be converted to Parser<com.example.Message.Something>
        PARSER = new com.google.protobuf.AbstractParser<Something>() {

Anything else we should know about your project / environment

This was already reported ~4 years ago, but seems to have been closed: https://github.com/protocolbuffers/protobuf/issues/6987

Nevertheless, this is still an issue, and is becoming increasingly painful for us, especially when enabling protobuf-java to existing proto files (previously used with other languages), when migrating already widely used protos in multiple different languages is hardly an option...

One of directions for fixing this is to escape type names - e.g. this is something what ScalaPB does for reserved Java keywords (RESERVED in SchemaGenerators.scala), and it does the job well.

No backwards-compatibility issues should arise for fixing this, because currently, such code does not compile.

donce avatar Jun 18 '25 10:06 donce

At first glance, such proto might seem like an anti-pattern, or something which occurs very rarely, but it's actually quite common in our codebase with many independent teams - the main reason being proto enum limitation which lead to people nesting their enums into messages with same name: https://github.com/protocolbuffers/protobuf/issues/5425

donce avatar Jun 18 '25 12:06 donce

This particular example seems legitimate thing for us to add a mitigation for in our generator. I'm surprised that Java language itself even has this restriction.

We'll need to check on the history of the prior linked issue and get back to you next week (it doesn't seem like the prior issue was actually closed by a bot so there could be more context that wasn't communicated back), but IMO it would make sense to add a check for this case and append something like Inner or Nested on the inner scoped type for this case.

Note that our normal stance is not that any legal name in isolation should work (so, every generator has to deal with mangling names which are the corresponding language's keywords), but breakages that only occur due to two legal names being used next each other is usually "unfortunately wontfix".

The reason is that we unfortunately have no holistic strategy in our generators to deal with cross-field collision problems, and there's real downsides to trying to have 50 weird conditional renames sprinkled around in every generator to try to mitigate them, and it just always results in subsequent real bugs later if something else tries to reference a name and doesn't arrive at the same conclusion (including eg some other generator, or kotlin bindings, or whatever).

I think this specific example looks like it is the rare example of this type of collision report which I think probably does clear the bar for something that is worth the cost of a bespoke mangle check though.

esrauchg avatar Jun 26 '25 20:06 esrauchg

Great, let me know how it goes - we are very interested in this issue :)

Yeah, when escaped names start to clash, I do imagine it can get too complicated to resolve those issues, especially with backwards-compatibility, but indeed this is not that case, as it's just a simple proto which does not compile in Java, while it does work in other protobuf libs..

Appending something like Inner or Nested does sound good as well, just not entirely how that would work when same name is nested at 3 levels - Foo + FooInner + FooInnerInner? Though in our case, I'm not really sure if supporting any number of nesting is really important, I think all cases are 2 levels (mostly wrapped enums).

donce avatar Jun 27 '25 07:06 donce

We triage inactive PRs and issues in order to make it easier to find active work. If this issue should remain active or becomes active again, please add a comment.

This issue is labeled inactive because the last activity was over 90 days ago. This issue will be closed and archived after 14 additional days without activity.

github-actions[bot] avatar Oct 13 '25 10:10 github-actions[bot]

Sorry this didn't get an answer. I'll bump this topic again internally.

esrauchg avatar Oct 14 '25 12:10 esrauchg