prost
                                
                                 prost copied to clipboard
                                
                                    prost copied to clipboard
                            
                            
                            
                        prost_build: type_attribute path incorrectly matches submessages (no way to specify exact match)
If there is a nested message, adding a type attribute to an outter message will add the type attribute to the inner message as well. Example, given the following message:
syntax = "proto3";
package mydemo;
message Outter {
	enum Inner {
		A = 0;
		B = 1;
	}
	repeated Inner inners = 1;
	float field = 2;
}
and the following build.rs:
use std::io::Result;
fn main() -> Result<()> {
    let mut config = prost_build::Config::new();
    config.type_attribute(".mydemo.Outter", "#[serde(default)]"); 
    config.compile_protos(&["proto/demo.proto"], &["proto/"])
}
Expected:
I would expect only mydemo.Outter to have a #[serde(default)] annotation.
Actual: This will also apply an annotation to mydemo.Outter.Inner. Here is the generated rust:
#[serde(default)]
#[derive(Clone, PartialEq, ::prost::Message)]
pub struct Outter {
    #[prost(enumeration="outter::Inner", repeated, tag="1")]
    pub inners: ::prost::alloc::vec::Vec<i32>,
    #[prost(float, tag="2")]
    pub field: f32,
}
/// Nested message and enum types in `Outter`.
pub mod outter {
    #[serde(default)]
    #[derive(Clone, Copy, Debug, PartialEq, Eq, Hash, PartialOrd, Ord, ::prost::Enumeration)]
    #[repr(i32)]
    pub enum Inner {
        A = 0,
        B = 1,
    }
}
It can be seen that Inner incorrectly has a #[serde(default)] annotation. This is an issue in my case, because a #[serde(default)] cannot apply to an enum (and without the annotation on Inner, Outter can be default since inners is a Vec).
Work around tried:
- add a specific empty type annotation for Inner: config.type_attribute(".mydemo.Outter.Inner", "");this did not work, regardless oftype_attributeordering.
Potential solutions:
- add an additional "end of match" delimiter (example: "$"), so you can specify something like .mydemo.Outter$
- use a more standard matching engine such as glob. (or regex, though that tends to be less user-friendly)
- add a visitor pattern, where the user provides a closure that that allows introspection on each type/field before producing an annotation.
- add first class serde support that is aware of typing issues, so you can say "enable serialization for all valid types under path"
IMO the visitor solution is probably the most robust and powerful. For example, the visitor closure can be provided a prost_types::DescriptorProto or FieldDescriptorProto respectively, and produce a string. My secondary preference would be glob. The current matching function isn't particularly well documented, and having a standard matching function would make it easier to grok.
I'm using prost_build version 0.9. Please let me know if there is a workaround here.
I am also running into https://github.com/tokio-rs/prost/issues/504, which seems to be an unrelated but similar issue with path matching.
Looks like the code generator (https://github.com/tokio-rs/prost/blob/d45a37cab69c2a0cda36a97df8823fcaf43dc1fe/prost-build/src/code_generator.rs#L185) has access to both the DescriptorProto or FieldDescriptorProto as well as the Config at the appropriate place. So adding a visitor likely wouldn't require much (or any) refactoring.
I think you would have to do the following:
- add a member type_visitors: PathMap<&'a mut FnMut(&DescriptorProto) -> Option<String>>andfield_visitors: PathMap<&'a mut FnMut(&FieldDescriptorProto ) -> Option<String>>to Config
- and add a lifetime parameter to Config to support the new members
- in CodeGenerator::append_type_attributes and CodeGenerator::append_field_attributes, add another loop to call each matching respective visitor and append any Some(String) returned.
- Add a new method to Config, such as pub fn type_visitor<'a, P>(&mut self, path: P, visitor: &'a mut FnMut(FieldDescriptorProto ) -> Option<String>) -> &mut Selfand similar forfield_visitor.
- Add tests. I think PathMap already covers the difficult parts, maybe just test multiple overlapping-path visitors, and visitors that return None.
+1 for the visitor based approach, I think this ultimately gives the most flexibility for the consumer, and it is consistent with the existing ServiceGenerator visitor API.
Hmm, I was making a quick mock up of the proposed solution and ran into a couple of ambiguities:
- enums use a EnumDescriptorProto, but call append_type_attributes
- Oneofs use a OneOfDescriptorProto but call append_type_attributes
- enums use a EnumValueDescriptorProto but call append_field_attributes
- Oneofs use a list of FieldDescriptorProtos
Seems like oneofs and enums are a little weird, but field and type attributes still apply to them. I guess the visitors can be provided a rust enum of the conflicting descriptor protos? But that seems a little ugly...
It seems like the ServiceGenerator uses a AST Service struct, even though a ServiceDescriptorProto exists.
It seems the AST struct also provides the proto name, package, and comments but is otherwise the same info.
It looks like the AST struct is [created from the ServiceDescriptorProto](https://github.com/tokio-rs/prost/blob/922a9a163f79e06c2b7a89c0bf417aa50e9081b8/prost-build/src/code_generator.rs#L703) and passed into any ServiceGenerator, and that's it's sole purpose.
Likely a similar structure should be mimic'd. With a AST Type and AST Field struct being created and passed to whatever visitor.
But, for ServiceGenerator, they use a trait pattern, passed to the Config. I suppose this makes sense because a Config is supposed to only have one ServiceGenerator. IMO, it would make sense to provide a closure instead of a trait object as a type/field visitor, as I imagine it would make sense to have multiple and using a closure would be much less verbose.
I suppose, like the ServiceGenerator, a buf: &String could be provided as an output parameter (to prevent allocation).
Maybe adding a visitor to a config would look like:
     config.type_visitor("path.to.package", |ty, buf| {
         if(ty.name.endswith("GRPC") {
             buf.push_str("#[serde(rename_all = \"camelCase\")]");
         }
    });
(much lighter than a trait IMO)
In hopes that future searches will find this bug; it applies to message_attribute() as well; a mutual dependency/nested message will be matched by message_attribute() and have the supplied attribute written to its definition.
Additionally,  this issue conspires with the fact that successive calls to compile() will overwrite files emitted by prior calls. Meaning:
- 
the order in which a single threaded build.rs calls compile()may cause a mutual dependency to be compiled an inscrutable number of times, with each successive call overwriting the prior's attributes making refactors very surprising.
- 
Build scripts that concurrently compile separate files with mutual dependencies are in a data race. 
Although fixing this will be a breaking change, it may be more discoverable for users if the current compile() method was deprecated with an explanatory message and a compile_v2() exposed in a minor semver bump.