prost icon indicating copy to clipboard operation
prost copied to clipboard

Nested messages cause cause overlapping module names

Open jpittis opened this issue 5 years ago • 16 comments

It looks like if you have a message with nested messages, prost will compile these into a rust module named after the parent message.

For example prost compiles Envoy's xDS API Cluster message from

package envoy.api.v2;
message Cluster {

into

pub mod cluster {

The trouble is when there's also a protobuf sub module with the same name as the parent message.

package envoy.api.v2.cluster;

These two modules will conflict:

pub mod v2 {
  tonic::include_proto!("envoy.api.v2");
  pub mod cluster {
      tonic::include_proto!("envoy.api.v2.cluster");
  }
}

I happen to be using tonic, but I believe this needs to be fixed in prost? Could be I'm confused and there already a config option to fix this... or maybe I should be opening the ticket on the tonic side?

Thanks <3

jpittis avatar May 15 '20 02:05 jpittis

I created a MVP reproduction of this issue using pure prost / prost-build.

https://github.com/jpittis/prost-311-repro

$ cargo build
   Compiling prost-331-repro v0.1.0 (/home/jpittis/code/prost-331-repro)
error[E0428]: the name `repro` is defined multiple times
 --> /home/jpittis/code/prost-331-repro/target/debug/build/prost-331-repro-12142094c2224c01/out/repro.rs:6:1
  |
6 | pub mod repro {
  | ^^^^^^^^^^^^^ `repro` redefined here
  |
 ::: src/lib.rs:3:5
  |
3 |     pub mod repro {
  |     ------------- previous definition of the module `repro` here
  |
  = note: `repro` must be defined only once in the type namespace of this module

error[E0412]: cannot find type `SubMessage` in module `repro`
 --> /home/jpittis/code/prost-331-repro/target/debug/build/prost-331-repro-12142094c2224c01/out/repro.rs:4:43
  |
4 |     pub sub: ::std::option::Option<repro::SubMessage>,
  |                                           ^^^^^^^^^^ not found in `repro`

error[E0283]: type annotations needed
 --> /home/jpittis/code/prost-331-repro/target/debug/build/prost-331-repro-12142094c2224c01/out/repro.rs:1:28
  |
1 | #[derive(Clone, PartialEq, ::prost::Message)]
  |                            ^^^^^^^^^^^^^^^^ cannot infer type
  |
  = note: cannot resolve `_: std::default::Default`
  = note: required by `std::default::Default::default`

error: aborting due to 3 previous errors

Some errors have detailed explanations: E0283, E0412, E0428.
For more information about an error, try `rustc --explain E0283`.
error: could not compile `prost-331-repro`.

To learn more, run the command again with --verbose.

If I rename the proto message in repro.proto, it builds without error:

$ git diff
diff --git a/src/repro.proto b/src/repro.proto
index a8bc6f6..c68cb9e 100644
--- a/src/repro.proto
+++ b/src/repro.proto
@@ -2,7 +2,7 @@ syntax = "proto3";

 package repro;

-message Repro {
+message Repro2 {
   message SubMessage {
     uint64 field = 1;
   }

jpittis avatar May 16 '20 22:05 jpittis

Thinking about this a bit more, it's a bit of a sticky problem because no matter how we prefix the rust module, we could always produce a sub proto module which conflicts.

Without knowing the history of this feature / much about prost, my initial thought is that nested messages should not be placed in rust modules with the name of their parent message. Instead, we should prefix the message name itself. (ie Repro_SubMessage rather than repro::SubMessage)

It's not exactly a satisfying solution but as of now, my understanding is that prost can't be used for a number of existing protobuf projects (for example Envoy's xDS API protobufs).

jpittis avatar May 16 '20 23:05 jpittis

We're running into this as well with the Envoy xDS API protobufs. Prefixing the message name itself seems to be the way to go, as this is how both Go and C++ handle this case.

samschlegel avatar Aug 03 '20 18:08 samschlegel

We're running into this as well with the Envoy xDS API protobufs. Prefixing the message name itself seems to be the way to go, as this is how both Go and C++ handle this case.

So I can clearly understand: you are modifying the name of the message in the Protobuf file, so that the generated rust code does not clash?

Running into this problem as well...

cetanu avatar Aug 07 '20 12:08 cetanu

I notice that the envoy protobufs have options such as the following:

option csharp_namespace = "Envoy.Api.V2.ClusterNS";
option ruby_package = "Envoy.Api.V2.ClusterNS";

Perhaps something like this could be included to also signal prost or protoc to handle the code generation differently. Just leaving this comment so that I don't forget.

cetanu avatar Aug 07 '20 15:08 cetanu

So I can clearly understand: you are modifying the name of the message in the Protobuf file, so that the generated rust code does not clash?

I've actually just put that side project on pause for now, but the solution seems to be to move nested messages out of a module, and prefix them with the containing message name, like how C++/Go/ObjC/PHP/Dart do it. https://developers.google.com/protocol-buffers/docs/reference/cpp-generated#message i.e. if you have message Foo { message Bar { } }, it would generate Foo and Foo_Bar.

This seems like a pretty easy change, but it's definitely a breaking one and I'm not sure how making breaking changes like that should work with prost. Is an option to the generator ok? How many major version changes would be ok before making that change the default?

I notice that the envoy protobufs have options such as the following: I've also thought about adding this, but more in the context of module organization.

Unfortunately it's blocked on #256 which itself is blocked on #317 which is waiting on review still.

samschlegel avatar Aug 07 '20 17:08 samschlegel

I ended up being able to rename a few folders and adjust package names to get this sort of working. I was able to set up a ClusterDiscoveryService stub as a really basic proof that the types are somewhat acceptable...

This utility downloads the protobufs: https://github.com/cetanu/sovereign-rs/blob/master/tools/download-protos.py It could be even more dynamic... Although "fixing" it this way seems wrong.

The rest of the repo has the really basic tonic things setup, with the CDS stub mentioned.

cetanu avatar Aug 20 '20 10:08 cetanu

This seems like a pretty easy change, but it's definitely a breaking one and I'm not sure how making breaking changes like that should work with prost. Is an option to the generator ok? How many major version changes would be ok before making that change the default?

Yes typically codegen things like this are exposed as an option on the prost-build config struct, then the application can control how the codegen happens. That being said prost derive's internally use a lot of relative paths, so it may be very tricky to expose this as a codegen option, since those typically only control the 'first level' of generated rust code (through prost-build), and not derive'd code (through prost-derive). That being said anything should be possible with enough annotations. I do think the current behavior is the correct default, since I think it's more idiomatic to put nested messages into submodules, but I'm in support of making it configurable.

danburkert avatar Aug 20 '20 16:08 danburkert

That being said anything should be possible with enough annotations. I do think the current behavior is the correct default, since I think it's more idiomatic to put nested messages into submodules

Without trying to get into a debate, I would counter that regardless of how idiomatic the current implementation is, it's incorrect because there exists valid protobuf structure that it cannot generate. For example, if the Go or C++ implementation had this behavior, it might be considered a bug? (I'm less familiar with the protobuf ecosystem so I could be very wrong.)

Regardless, adding a configurable feature is the next step here, default or not.

jpittis avatar Sep 06 '20 21:09 jpittis

revisiting this while triaging issues. Perhaps prost-build could be a bit smarter here and automatically recognize when it's generating a submodle, that a 'top-level' module with that name already exists, and move the submodule contents over to it. There is still the possibility for symbol collisions with nested message types, but I think it'd be relatively more rare. This is also starting to look more and more like something that might fall out naturally from a good 'single include' feature, which is something I'd really like to do.

danburkert avatar Nov 15 '20 20:11 danburkert

For example, if the Go or C++ implementation had this behavior, it might be considered a bug?

To be clear I definitely agree that this is a bug in prost, I'm just not sure what the best way to solve it is. prost generates code differently than C++'s code generator not only because of idioms, but because rust's module system is fundamentally different than C++'s namespaces. You wouldn't have this issue in C++ because namespaces can be split among multiple files. I can't comment on Go. Unfortunately protobuf's package design is hand-in-glove with C++'s namespace design, so occasionally things require workarounds.

danburkert avatar Nov 15 '20 20:11 danburkert

I recently encountered the same problem and stumbled upon this thread. (also while trying to compile Envoy XDS proto API).

Perhaps prost-build could be a bit smarter here and automatically recognize when it's generating a submodle, that a 'top-level' module with that name already exists, and move the submodule contents over to it.

I'm not sure this is the "right" approach? Protobuf package path and nested message are two different "namespaces". It's perfectly valid to have a package path that look similar to a nested message path without having to worry about naming collision. Treating both of them as Rust modules would be conflating these two into a single "namespace".

fiibbb avatar Jan 05 '21 22:01 fiibbb

Did anyone manage to solve this in their own repo and could show an example of what they did? I'm revisiting this again.

Perhaps prost-build could be a bit smarter here and automatically recognize when it's generating a submodle, that a 'top-level' module with that name already exists, and move the submodule contents over to it. ... This is also starting to look more and more like something that might fall out naturally from a good 'single include' feature, which is something I'd really like to do.

I think this would be pretty cool also. Just glancing over the code generating bits, I don't have a clue how I would even start on this 😄

cetanu avatar Dec 08 '21 10:12 cetanu

I have run into this problem as well, also with the Envoy xDS API protobufs.

gsalisbury avatar Apr 14 '22 14:04 gsalisbury

FYI that though this issue is not resolved in prost, the v3 xDS APIs no longer run into this scenario.

I've posted the generated v3 APIs as a crate at https://crates.io/crates/data-plane-api.

jpittis avatar Dec 26 '22 21:12 jpittis