prost icon indicating copy to clipboard operation
prost copied to clipboard

JSON Mapping

Open cretz opened this issue 6 years ago • 12 comments

Are there any plans to support the official proto3 JSON mapping in prost-build? When building https://github.com/cretz/prost-twirp I have found a need for it. Even if it is not supported directly, the field/type attributes in the prost-build config are still difficult to use for this purpose. I suppose I could parse the protobuf myself w/ the other lib here and then create the field/type attributes though it would be nice if prost-build gave me the AST.

cretz avatar Feb 13 '18 18:02 cretz

I configure prost to derive Serialize in my build.rs:

let mut config = prost_build::Config::new();
config.type_attribute(".", "#[derive(Serialize)]");
config.type_attribute(".", "#[serde(rename_all = \"camelCase\")]");

Are there ways in which the proto3 JSON mapping differs from what serde can do?

adeschamps avatar Feb 13 '18 19:02 adeschamps

@adeschamps - I haven't tested, but there is at least one obvious problem with that approach. What if your protobuf fields used underscores in the .proto file? You would always change them back to camel case which is wrong. There are surely other reasons you'd either need the AST or need to be built in to the code generator which is making decisions a type/field at a time. It definitely doesn't differ from what serde can do, it's just about applying serde to match the mapping properly.

cretz avatar Feb 13 '18 20:02 cretz

Doesn't protobuf convert field names to camel case for JSON anyway?

adeschamps avatar Feb 13 '18 20:02 adeschamps

Ah yes, you appear to be correct. Maybe I will give this a whirl and see if it passes the conformance tests.

cretz avatar Feb 13 '18 20:02 cretz

I'm interested in doing something similar and I would love to know if this works too!

trainman419 avatar Feb 15 '18 04:02 trainman419

Hey all, this has been a really good discussion. I don't have any plans to use or work on proto3 JSON support, but if someone can figure out a way to do it in a straightforward way, ideally by leveraging Serde, then I'm all for it.

@cretz hopefully you've already seen it already, but there is an in-tree conformance test crate which skips the JSON tests. You should be able to hook that up, and add the JSON/Serde data to the protobuf crate.

danburkert avatar Feb 15 '18 16:02 danburkert

I am considering doing this as something I'm trying to support at work, is anyone else actively working on this? I was planning on building a new derive which will generate a serde_json compatible Serialize and Deserialize implementation.

jroesch avatar Feb 11 '20 22:02 jroesch

I have as of recent developed a need for prost to have JSON support, so I've been thinking about how to do this. I'm not going to be able to work on it immediately, but what I'm thinking is that prost-build should grow a new option which causes code generation to emit serde-derive annotations which result in Serialize/Deserialize impls compliant with the protobuf JSON spec. If anyone is interested in getting started on it, please feel free.

danburkert avatar Dec 30 '20 20:12 danburkert

I was digging into turning my prost generated models into something I can use with Serde json/yaml - mostly through manipulating config.type_attribute and config.field_attribute -- the biggest blocker I ran into was handling of Enum values, but it seems relevant to the topic at hand of supporting JSON Mapping, so figured I would share my research and design thoughts.

For example, if have a proto that outputs an enum of:

#[derive(serde::Serialize, serde::Deserialize)]
#[serde(rename_all = "camelCase")]
#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash, PartialOrd, Ord, ::prost::Enumeration)]
#[repr(i32)]
pub enum MyEnum {
    First = 0,
    Second = 1,
}

(I left on my serde attributes, so you can see them) - I think we will need a custom deserialiser for turning enum JSON into Rust values. Since it can take either the name (e.g. "First") or the integer value (e.g. 0).

According to the spec:

enum | string | "FOO_BAR" | The name of the enum value as specified in proto is used. Parsers accept both enum names and integer values.

I don't think standard serde attributes are going to be good enough to handle this out of the box, we're going to need some custom tooling.

I'm just poking at prost<->serde stuff atm, but thought I'd share if it helps.

markmandel avatar Feb 05 '21 18:02 markmandel

It's very much beta software, and lacks complete coverage of all the well-known-types (see the issue tracker) as I've currently only implemented the ones needed by IOx, but I recently released pbjson-build which derives compliant serde implementations for prost generated types.

I could not find a nice way to do this using serde annotations as suggested on this issue, as the enumerations, 64-bit integers, and generally loose deserialization requirements made this at least not obvious to myself.

Disclaimer: I am the author of this crate

tustvold avatar Sep 29 '21 09:09 tustvold

+1 currently relying on envoy proxy to handle transcoding from json to protobuf and vice versa -> important to note hacking Serialize / Deserialize into the output does not handle nested oneof structures, among other things such as lowercase fields, which breaks compliance

lacasaprivata2 avatar Oct 21 '21 12:10 lacasaprivata2

Using Serde also fails for prost_types::Any, which does not implement the Serialize attribute.

thurn avatar Jan 02 '22 15:01 thurn

Json support is listed in the roadmap #624 . What is the plan for achieving this? Can we work on integrating pbjson (great work btw!) into this project somehow?

Ideally, it would simply be a case of enabling an option in prost-build, and a feature flag on prost-types.

thburghout avatar Nov 12 '22 11:11 thburghout

There currently isn't much of a plan since I don't have time to work on feature work here right now. The 1.0 plan is more of a want than actually something I think we can achieve right now. I think we could eventually try to get pbjson merged into here.

LucioFranco avatar Dec 12 '22 17:12 LucioFranco

I think we could eventually try to get pbjson merged into here

I would be more than happy to help make this happen, just let me know. I don't foresee any issues from my side with a potential donation if that is something you would be interested in. It would be awesome to resolve the current pbjson-types vs prost-types situation (https://github.com/influxdata/pbjson/issues/75)

tustvold avatar Dec 12 '22 17:12 tustvold

Yeah, I think to move this forward the best way would be to write a proper proposal that I can review in an issue that way we can agree on what makes sense before anyone writes a PR. Feel free to tag me in that issue if you're interested in taking that on. 😄

LucioFranco avatar Dec 12 '22 19:12 LucioFranco

Are there ways in which the proto3 JSON mapping differs from what serde can do?

Found one, protobuf bytes type maps to a base64 encoded string. I copied a workaround from here https://github.com/serde-rs/serde/issues/661#issuecomment-269858463 with code like this

use serde::Serializer;

pub fn as_base64<S>(
    data: &Vec<u8>,
    serializer: S,
) -> Result<<S as Serializer>::Ok, <S as Serializer>::Error>
where
    S: Serializer,
{
    serializer.serialize_str(&base64::encode(&data[..]))
}

and then in build.rs

    config.field_attribute(
        "ImagePayload.data",
        "#[serde(serialize_with = \"crate::utils::as_base64\")]",
    );

RedKinda avatar Jan 15 '23 19:01 RedKinda

I think to move this forward the best way would be to write a proper proposal that I can review in an issue

I'm realistically not going to have time to undertake this in the near term, but would be happy to help with IP clearance, etc... if someone else wishes to pick this up.

Are there ways in which the proto3 JSON mapping differs from what serde can do?

Off the top of my head, the major differences are:

  • Oneof is flattened when serialized
  • Integers and enumerations can be deserialized from strings
  • 64-integers are serialized to strings
  • Bytes are base64 encoded
  • Well-known types are serialized in custom ways
  • Fields are named differently, and the JSON name can be overriden in proto definitions

Using serde derive directly is fine if you just want a JSON serialization, if you want the protobuf serialization, you are very likely to run into mismatches between the two

tustvold avatar Jan 15 '23 19:01 tustvold