prost
                                
                                 prost copied to clipboard
                                
                                    prost copied to clipboard
                            
                            
                            
                        Make it easy to write forward-compatible code: struct builders?
In protobuf, adding a field to a message is considered a backward compatible change. It’s compatible at the wire format level and also at the source level for many sets of bindings, including the standard bindings for Python, Go, C++, and Java.
Prost makes it easy to directly initialize structs of message type:
// message Person { string name = 1; string email = 2; }
let person = proto::Person {
    name: "A. U. Thor".into(),
    email: "[email protected]".into(),
};
But then if a new type is added to the message, the Rust code fails to compile, because it’s no longer initializing all the fields.
One way to achieve compatibility is to spread in ..Default::default():
let person = proto::Person {
    name: "A. U. Thor".into(),
    email: "[email protected]".into(),
    ..Default::default() // more fields may be added later
};
This works, but (a) you have to remember to do it every time, and (b) it can get a bit spammy when there are multiple layers of nesting:
            ..Default::default()
        },
        ..Default::default()
    },
    ..Default::default()
}
It would be nice to have a way to guarantee better compatibility. One approach would be some kind of optional builder API:
let person = proto::Person::builder()
    .name("A. U. Thor".into())
    .email("[email protected]".into())
    .build();
I don’t know what exactly the right ergonomics are here, especially around nested fields. (I half-wonder if the builder should take a closure, but I think that that might get a bit hairy.)
A nice side-effect of such a builder pattern would be that the methods
could take impl Into<FieldType>, so you could pass &str literals
rather than having to write .to_string()/.into() everywhere.
An option to set #[non_exhaustive] on structs doesn’t sound quite
right to me, since (a) that doesn’t do anything inside the same crate
and (b) then you can’t use functional record updates (..default()).
It’s fine with me if the struct fields remain public; we can have a style recommendation that says “always use builders when building messages of a type that you don’t control”, or something. Though being able to control that via a codegen option would be nice, too.
This is an important consideration for using Prost in shared or large codebases, especially monorepos. It’s common to have a shared protobuf definition with autogenerated bindings that many teams depend on. Enabling the proto owners to make forward-compatible changes without breaking other teams’ builds would be a requirement to use Prost.
I don't agree that adding a field to a message being a backwards compatible wire change is sufficient motivation to also guarantee that it's a source-compatible change.  There are plenty of other guaranteed wire-compatible changes that typically are not source compatible in any (typed) language bindings, e.g. widening an integer, moving a field into a oneof, renaming a field, removing an optional field, etc.
I do think #[non_exhaustive] is the way forward for users who want to expose a prost-generated type in a public crate API that needs stability guarantees.  I don't personally think this is a wise thing to do, but a lot of people seem to want to do it 🤷 .  For workspace-style private codebase I find that the compiler pointing out where code needs to get updated is extremely helpful.
(b) then you can’t use functional record updates (
..default())
This appears to work just fine: https://gist.github.com/a64fe9b1dd55d89ed8687381395f02d4
This appears to work just fine
Maybe that's an artifact of the type being in the same crate?  I haven't used #[non_exaustive] in anger since it's a relatively new feature, so I'm not exactly sure what its limitations are.
Hi! Thanks for your response.
I don't agree that adding a field to a message being a backwards compatible wire change is sufficient motivation to also guarantee that it's a source-compatible change. There are plenty of other guaranteed wire-compatible changes that typically are not source compatible in any (typed) language bindings, e.g. widening an integer, moving a field into a
oneof, renaming a field, removing an optional field, etc.
Agreed: not all wire-compatible changes are source-compatible. But adding a field specifically is meant to be both wire-compatible and source-compatible. The guidelines (also linked above) say that fields “may be added to existing APIs in the same version”, and also say a single version “must” be source- as well as wire- and semantic-compatible, which implies the conclusion.
Correspondingly, even though Go, Java, and C++ are all typed, their language bindings treat this as a source-compatible change. Java and C++ only expose getters and setters (and/or builders). Go has structs like Prost does, but omitted fields in Go struct literals are implicitly set to zero values.
For workspace-style private codebase I find that the compiler pointing out where code needs to get updated is extremely helpful.
Absolutely; I’m a huge fan of this workflow as well (“change one type,
fix all the errors, it works first try”). To quote from Abseil’s “Use
Exhaustive switch Statements Responsibly” (h/t @nfelt), for
this to work, you need one of three things:
- The owner of the
enumguarantees no newenumerators will be added,- The owner of the
enumis willing and able to fix our code when new enumerators are added (e.g. theenumdefinition is part of the same project),- The owner of the
enumwill not be blocked by breaking our build (e.g. their code is in a separate source control repository), and we’re willing to be forced to update our switch statements when updating to the latest version of theenum-owner’s code.
The “change one type, fix all the errors, it works first try” flow is basically picking option (2), but for published protos this doesn’t work. For protos that aren’t set in stone, this leaves (3), which may be impossible (monorepo) or undesirable (if you want to take the proto semantics).
This appears to work just fine
Maybe that's an artifact of the type being in the same crate?
Yeah, that’s right:
$ cat sub.rs
#[non_exhaustive]
#[derive(Debug, Default)]
pub struct S {
    pub a: i32,
    pub b: i32,
}
$ cat main.rs
extern crate sub;
fn main() {
    let s = sub::S {
        a: 1,
        b: 2,
        ..Default::default()
    };
    dbg!(s);
}
$ rustc sub.rs --crate-type=lib
$ rustc main.rs -L.
error[E0639]: cannot create non-exhaustive struct using struct expression
 --> main.rs:3:13
  |
3 |       let s = sub::S {
  |  _____________^
4 | |         a: 1,
5 | |         b: 2,
6 | |         ..Default::default()
7 | |     };
  | |_____^
error: aborting due to previous error
For more information about this error, try `rustc --explain E0639`.
By the way, I played with sketching out an API for this, and was pleased
to see that if you put the builder-related methods in traits,
then the whole thing works fine even if the proto happens to have a
field called build or something else pathological. (Maybe obvious to
you, but it’s not too often that I see the tangible benefits of trait
methods being namespaced as they are.)
The guidelines (also linked above) say that fields “may be added to existing APIs in the same version”, and also says a single version “must” be source- as well as wire- and semantic-compatible.
That link appears to be to Google Cloud Client guidelines, that's not really relevant to the design of a general purpose protobuf library. I haven't seen any protobuf docs or guidelines that say anything about source compatibility of generated code in any context, and in practice there is no such guarantee as I pointed out.
WRT #[non_exhaustive] that's pretty unfortunate, it makes it a little less useful of a tool.  It's still possible to use #[non_exhaustive] without a builder API though, doing something like:
let mut msg = MyMessage::default();
msg.foo = "bar".to_owned();
...
Also note that you could in theory attach a general purpose builder derive macro to prost generated types, e.g. https://crates.io/crates/derive_builder.  I haven't tried this so maybe there's some sticking points, but I believe all the hooks are available in prost-build to make it possible.
Combination of #[non_exhaustive] and builder sounds like a best option to me since then it's possible to match on the struct and access the fields directly without creating compatibility hazard.
I kinda like the idea of general-purpose builder although I'm not sure if there's a reason why prost-specific could be better. One thing that comes to my mind is avoiding naming conflicts (if someone funny names protobuf messages Foo and FooBuilder).
My use case is a library for quite large gRPC protocol that I don't have the time to write cleaner bindings for. (I wish I had.)
https://lib.rs/crates/proto-builder-trait
This seems to do the job? Maybe it could be upstreamed?
derive_builder does away with a lot of the boilerplate and is very flexible, but I think a code generator using specific knowledge of the protobuf type system would be able to provide a better API out of the box.
Consider repeated fields:
message HelloRequest {
    repeated string flags = 2;
}
The builder for HelloRequest could provide this setter method:
impl HelloRequestBuilder {
    pub fn flags<T>(mut self, flags: impl IntoIterator<Item = T>) -> Self
    where T: Into<String>,
    {
        self.inner.flags = flags.into_iter().map(Into::into).collect();
        self
    }
}
Stylistically, I don't like the builder type becoming a part of the API that you have to use by name. It's better to add an associated fn to the message type, and have the builder type tucked away in a submodule:
impl HelloRequest {
    pub fn builder() -> hello_request::HelloRequestBuilder {
        todo!()
    }
}
As realized in #901, a protobuf-specific generator can provide much more ergonomic API than a general purpose derive.