prost icon indicating copy to clipboard operation
prost copied to clipboard

A more ergonomic enumeration representation

Open sfackler opened this issue 4 years ago • 25 comments

To deal with the "openness" of Protobuf enumeration types, Prost currently represents them in the message struct as i32 and generates a Rust enum that can convert to and from i32. This is a bit awkward to work with, though.

A better approach could be to use a newtype with associated constants. To use the Syntax enum as an example, it would look like this:

#[derive(Copy, Clone, PartialEq, Eq, PartialOrd, Ord, ...)]
pub struct Syntax(i32);

impl Syntax {
    pub const PROTO2: Syntax = Syntax(0);
    pub const PROTO3: Syntax = Syntax(1);

    // alternatively the field could just be public
    pub fn from_raw(raw: i32) -> Syntax {
        Syntax(raw)
    }

    pub fn as_raw(self) -> i32 {
        self.0
    }
}

This approach has a couple of advantages:

  • It can be embedded directly in the message struct.
  • It's more type safe since you normally won't have to interact with the raw i32 value.
  • You can still use it roughly like a Rust enum, and do things like match on the variants:
match syntax {
    Syntax::PROTO2 => { ... }
    Syntax::PROTO3 => { ... }
    syntax => panic!("unknown syntax variant {}", syntax.as_raw()),
}

sfackler avatar Feb 03 '20 21:02 sfackler

@sfackler are you aware that type-safe getters are generated for enum fields? For instance: https://docs.rs/prost-types/0.6.1/prost_types/struct.Type.html#method.syntax

danburkert avatar Feb 03 '20 22:02 danburkert

I do like this idea, though. It definitely has some merit. My experience has been that working with enums through the typesafe getter/setter is 'good enough', but they have a discoverability problem. I don't think they are actually documented anywhere other than the resulting rustdoc.

danburkert avatar Feb 03 '20 22:02 danburkert

Ooh, I did not know about those!

sfackler avatar Feb 03 '20 22:02 sfackler

I'd be happy to build out the implementation, but I'd be interested on your thoughts on how to structure it. 0.6 is pretty new, so this could just be a breaking change and prost bumps to 0.7, or it could end up as a new flag in prost_build::Config to allow users to opt into the new approach.

sfackler avatar Feb 03 '20 22:02 sfackler

@sfackler @danburkert How's this going? I have been using prost more and more, and I do find the enums slightly painful; I think this proposal goes a long way to easing that.

jakeswenson avatar Mar 19 '20 22:03 jakeswenson

I'm going to work on this over the weekend.

sfackler avatar Apr 11 '20 00:04 sfackler

For an enum:

enum SomeEnum {
    SOME_ENUM_FOO = 0;
    SOME_ENUM_BAR = 2;
}

prost-build will produce:

#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash, PartialOrd, Ord)]
pub struct SomeEnum(i32);

#[::prost::enumeration]
impl SomeEnum {
    pub const FOO: SomeEnum = SomeEnum(0);
    pub const BAR: SomeEnum = SomeEnum(2);
}

and prost-derive will expand that to:

#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash, PartialOrd, Ord)]
pub struct SomeEnum(i32);

impl SomeEnum {
    pub const FOO: SomeEnum = SomeEnum(0);
    pub const BAR: SomeEnum = SomeEnum(2);

    pub fn is_valid(self) -> bool {
        match self {
            SomeEnum::FOO | SomeEnum::BAR => true,
            _ => false,
    }
}

impl Default for SomeEnum {
    fn default() -> SomeEnum {
        SomeEnum::FOO
    }
}

impl From<SomeEnum> for i32 {
    fn from(value: SomeEnum) -> i32 {
        value.0
    }
}

impl From<i32> for SomeEnum {
    fn from(value: i32) -> SomeEnum {
        SomeEnum(value)
    }
}

sfackler avatar Apr 11 '20 01:04 sfackler

Hi @sfackler, I've run into the same issue but found this before writing my own code. Is this currently WIP? I'd be happy to help out if so.

nw0 avatar Apr 23 '20 12:04 nw0

I spent a while working on it, but then got sidetracked with other things I've pushed what I have onto a branch but I'm not sure when I'll be able to circle back to it: https://github.com/sfackler/prost/commit/0330d92fcb81761c97780326bdc3f43a5649efd5

sfackler avatar Apr 23 '20 23:04 sfackler

Can anyone share some application code that this alternate representation cleans up? I'd love to be convinced since this seems to come up over and over, but I'm still not seeing how this is materially better than the generated getters + generated (Rust) enums.

danburkert avatar May 10 '20 20:05 danburkert

Are the generated getters still working? I have this Prost-generated code:

#[derive(Clone, PartialEq, ::prost::Message)]
pub struct SuiteRegistrationResponse {
    #[prost(enumeration = "SuiteAction", tag = "1")]
    pub suite_action: i32,
}
/// Tells the suite what action it should perform, based on the args that the API container received
#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash, PartialOrd, Ord, ::prost::Enumeration)]
#[repr(i32)]
pub enum SuiteAction {
    /// Indicates that the testsuite should operate in metadata-serializing mode, sending suite metadata to the
    ///  API container
    SerializeSuiteMetadata = 0,
    /// Indicates that the testsuite should operate in test-executing mode, running a test
    ExecuteTest = 1,
}

which is throwing the following where when I try to use from:

error[E0308]: mismatched types
  --> lib/src/execution/test_suite_executor.rs:78:34
   |
78 |         let action = SuiteAction::from(action_int);
   |                                        ^^^^^^^^^^ expected enum `SuiteAction`, found `i32`

Version: prost 0.7, generated via tonic-build 0.4.0

mieubrisse avatar Feb 10 '21 18:02 mieubrisse

Yes, they are still present in prost 0.7. prost enums have never provided a From<i32> impl, I think you may be looking for the from_i32 method. Or you can use the SuiteRegistrationResponse::suite_action() getter.

danburkert avatar Feb 10 '21 18:02 danburkert

Gotcha, from_i32 was the missing bit (weirdly VS Code wasn't showing that as an option). Using that method works now!

mieubrisse avatar Feb 10 '21 19:02 mieubrisse

Yeah these methods are generated via derives, so RA doesn't seem to 'see' through it a lot of the time. Best way to work with prost in my experience is to generate the docs and look there; rust-doc will show the derive generated code.

danburkert avatar Feb 10 '21 20:02 danburkert

Maybe I'm missing something here but why not just generate a Rust enum with the #[non_exhaustive] attribute?

avsaase avatar Jun 09 '23 20:06 avsaase

The discoverability problem is pretty big imo. I want to expose a Prost generated struct as a function param in a library function. The rust docs just says i32 for the enum type. It would be much preferable if it actually showed the enum type and all variants.

@danburkert could we have a decision to move forward? There are multiple proposals better than the status quo, e.g Unknown(i32) or the proposal at the top of this issue.

benma avatar Jul 05 '23 17:07 benma

@avsaase

Maybe I'm missing something here but why not just generate a Rust enum with the #[non_exhaustive] attribute?

In proto3, the enums are open and the language guide suggests that unknown values are preserved in the message. I could not find anything more normative, but Prost currently conforms to this by erasing the enum type in message fields. While I mostly find myself needing more idiomatic domain types to work with protobuf-originated data anyway, this gives more room for programmer error than necessary.

~I lean towards this proposal~ This proposal has advantages over an added Unknown(i32) variant, since that one:

  1. injects a magical name which might come into conflict with an actually defined *UNKNOWN variant;
  2. loses #[repr(i32)] and makes the size of all generated enum types to be two 32-bit words rather than one.

A somewhat more cumbersome solution that preserves the enums could be a generic wrapper like this:

pub enum EnumFieldValue<T> {
    Known(T),
    Unknown(i32),
}

impl<T> EnumFieldValue<T> {
    pub fn unwrap(self) -> T {
        unimplemented!("for brevity")
    }

    pub fn known_or_else<E, F>(self, err: F) -> Result<T, E>
    where
        F: FnOnce(i32) -> E,
    {
        unimplemented!("for brevity")
    }
}

This has the same effect on the field size as Unknown(i32).

Regardless of the above, you'd want your proto3-derived enums to be annotated with #[non_exhaustive], as you'd want your structs (with some builder code to get over the initialization issues) since this is how bindings are expected to be evolved with protobuf's semver conventions.

mzabaluev avatar Mar 24 '24 21:03 mzabaluev

This has the same effect on the field size as Unknown(i32).

PhantomData to the rescue:

pub struct EnumFieldValue<T> {
    value: i32,
    _phantom: core::marker::PhantomData<T>,
}

impl<T> EnumFieldValue<T>
where
    i32: TryInto<T>,
{
    pub fn unwrap(self) -> T {
        let v = self.value;
        v.try_into()
            .map_err(|_| format!("unknown field value {}", v))
            .unwrap()
    }

    pub fn known_or_else<E, F>(self, err: F) -> Result<T, E>
    where
        F: FnOnce(i32) -> E,
    {
        let v = self.value;
        v.try_into().map_err(|_| err(v))
    }
}

mzabaluev avatar Mar 25 '24 03:03 mzabaluev