protobuf
protobuf copied to clipboard
protoc-gen-go: generate MarshalJSON/UnmarshalJSON methods for messages
Per https://github.com/golang/protobuf/issues/183:
There's several other parts of the proto3 JSON spec that make it impossible to use the standard encoding/json package, like WKTs.
I don't want to have generated protos grow any extra dependencies, so depending on the jsonpb package isn't appropriate.
I think json
usage is pervasive enough that we should revisit this decision. Message types whose JSON encoding diverges from the standard encoding/json
package should provide MarshalJSON
and UnmarshalJSON
methods. Then users could use the standard package without the need to explicitly interact with jsonpb
.
Yes, it's possible that would add a few more dependencies for programs that don't strictly need them. To the extent that that's a problem, it's a problem that should be fixed in the compiler and linker: the marginal increase in build and link times due to the extra dependencies seems like it would be miniscule compared to the complexity of interacting with two separate and incompatible packages for JSON encoding (e.g. https://github.com/golang/protobuf/issues/248).
Ok so what would the generated MarshalJSON method do. Would it call jsonpb (which calls encoding/json again) or would it be actual generated code that marshals json faster than with reflect and gives packages like ffjson a run for their money?
@awalterschulze: This proposal is about the API, not the implementation.
Initially it would probably just call jsonpb. If we find it's worth the extra overhead to generate code (or use some other optimization technique to avoid reflection), it would be easy enough to change the implementation while preserving the same API.
Ok so encoding/json is going to call the generated MarshalJSON method which is going to call jsonpb which sometimes also calls encoding/json. Some psuedo circular dependency. We hope it doesn't become a loop. Sounds like any easy bug to make.
Maybe we can list the differences between encoding/json en jsonpb? Even if its just for the record.
I know one reason was the fact that non string keys could not be encoded by encoding/json. But the go1.7 release notes now state the following:
In earlier versions of Go, this package only supported encoding and decoding maps using keys with string types. Go 1.7 adds support for maps using keys with integer types: the encoding uses a quoted decimal representation as the JSON key. Go 1.7 also adds support for encoding maps using non-string keys that implement the MarshalText (see encoding.TextMarshaler) method, as well as support for decoding maps using non-string keys that implement the UnmarshalText (see encoding.TextUnmarshaler) method. These methods are ignored for keys with string types in order to preserve the encoding and decoding used in earlier versions of Go.
Which means this problem might not be a problem anymore?
What are the other differences between encoding/json and jsonpb?
Maybe we can list the differences between encoding/json en jsonpb? Even if its just for the record.
Differences I'm aware of:
- Field names. (Easy enough to fix with a proto compiler change, but it's a breaking change for current users of the
json
package with generated message types.) - The default behavior for
json
is to include default ("empty") values; the default behavior forjsonpb
is to omit them. -
jsonpb
encodes enums by name rather than as integers. -
jsonpb
recognizes protobuf Well-Known Types and parses them accordingly. - Maybe some differences involving numbers encoded as strings vs. numbers encoded as JS number literals?
- Maybe handling of
NaN
s andInfinity
?
Some of these things are solvable:
- Field names: maybe the new json_name feature can be used to have something consistent here?
- "empty": simply generate structures with omitempty?
- enums: We can generate MarshalJSON and UnmarshalJSON methods for enums, like was done in the past?
- WKTs: We can write MarshalJSON and UnmarshalJSON for those types.
- numbers vs strings literals: This might be a problem?
- NaNs, Infinity, I also don't know about this.
So what if we start by moving the jsonpb implementation to be closer to the encoding/json implementation. Maybe we can start by generating omitempty for proto3 and then simply copying the behaviour of encoding/json for this case? Then we could start handling well known types in a less specific way in the jsonpb package and rather use MarshalJSON and UnmarshalJSON implementations. Then maybe we can get to the point where the proto json marshaler is simply the encoding/json Marshaler with a few custom options? Then our generated types don't need to generate MarshalJSON and UnmarshalJSON methods?
Then maybe we can get to the point where the proto json marshaler is simply the encoding/json Marshaler with a few custom options?
I agree that that's a nice goal in principle, and I've retitled the proposal accordingly. I'm not sure whether it's feasible.
Yes this title is something I can get behind.
Maybe we can start by generating omitempty for proto3 and then simply copying the behaviour of encoding/json for this case?
Yes please, this would solve so many problems that are common: people deserializing with jsonpb
and serializing with json
Ok, so to sum up this discussion Step 1 would be:
- general
MarshalJSON
andUnmarshalJSON
for message structs that simply delegate tojsonpb
- remove the
json
tag on the struct to not confuse people - add a flag to the
protoc-gen-go
compiler that allows you fallback to the previous behaviour (noMarshalJSON
UnmarshalJSON
generationjson
tag is there) - document it in a the release notes
protoc-gen-go
I like this approach, because it gets us 90% there (you can use encoding/json
like normal humans), there is a fallback for prior users of json, and the jsonpb
deserializes the "whole tree" (without sub message caling) so we de-risk any edge cases.
Step 2 would be to consider code-generating the MarshalJSON
/UnmarshalJSON
functions that work in a tree-like fashion. I.e. if a message has a nested message, the relevant MarshalJSON
/UnmarshalJSON
of the nested message is called. This may be tricky with:
- numbers vs strings literals: This might be a problem?
- NaNs, Infinity, I also don't know about this.
But I'm sure we can have helper functions in
jsonpb
library as a stepping stone that handles the field values inside the codegenned methods.
Note that MarshalJSON
and UnmarshalJSON
would have to be opinionated on which options (fields in jsonpb.Marshaler
and jsonpb.Unmarshaler
) are used. It may be fine to have this as a default, but I don't think we can remove the jsonpb
package.
@zombiezen True, but I think the protobuf JSON spec and the Go encoding/json
package provide pretty clear defaults.
The Marshaler
options:
-
EnumsAsInts
is not allowed by the spec: it requires "[t]he name of the enum value as specified in proto". -
EmitDefaults
is explicitly allowed and must default tofalse
. -
Indent
is allowed (the spec is silent about whitespace), so it should default to the same formatting produced byencoding/json
. -
OrigName
is not allowed by the spec: it requires that "names are mapped to lowerCamelCase".
The Unmarshaler
options:
-
AllowUnknownFields
is not addressed in the spec. We should probably get some clarification on that, and/or check the published conformance-test suite.
SGTM. Just wanted to make sure we thought about it.
I just wanted to add a note: the json StructTag value of "-" isn't checked. I found out about this in a new project, where I added json:"-" for a password field (just to be sure) - it didn't work. So it would seem, there is currently no way to ignore a value with jsonpb.
@morphar Curious, how did you set the JSON struct tag value of "-" on a generated type field? If this is for a dynamic message type, you can implement JSONPBMarshaler/JSONPBUnmarshaler. See https://github.com/golang/protobuf/blob/master/jsonpb/jsonpb_test.go#L767.
any progress on this? need help?
In case anyone is interested, I implemented protoc-gen-go-jsonpb to generate MarshalJSON/UnmarshalJSON functions for messages.
Would love MarshalJSON
/UnmarshalJSON
methods to be generated for Message types. Would make it a lot easier to deal with map/array types as well, such as the following:
message A {
string name = 1;
map<string, B> b_field = 2;
}
message B {
float64 C = 1;
}
In this case, say we have a backend service that needs to serialise/deserialise only A.b_field
:
func (h *Handler) storeBFields(a *mypackage.A) error {
b := a.GetBField()
jsonpb.Marshal(b) // <--- Wrong: map[string]*mypackage.B does not implement proto.Message
}
By having MarshalJSON
/UnmarshalJSON
methods, it would allow users to use only stdlib encoding/json
's Marshal/Unmarshal functions to unmarshal protobuf to it's exact JSON mapping, without using hacky workarounds.
Perhaps related, at least as far as decoding enums is concerned: https://github.com/golang/protobuf/issues/866
Whoops I missed this dup when I reported my issue (#926).
My stab is similar to @tnarg above, I didn't see this thread and googling didn't bring anything up. We've now started using this and so far has been working great: https://github.com/mitchellh/protoc-gen-go-json
It'd be great to have this sort of functionality built-in.
Is this waiting on the refactor mentioned in #266 for v2
of this pkg?
Anything members of the community can help out with to move this particular set of issues through? Very much would like to see this land in the core pkg.
Huh, this isn’t such a bad idea. Since we can override JSON marshaling/unmarshalling per message to call into jsonpb
instead, we can avoid incidents of people confusing which one they are supposed to be using. And meanwhile, standard encoding/json
does not have to support any of the unique issues of protobuf JSON encoding…
Thanks @mitchellh, that package works perfectly! Would love to see this built in as well.
Is there a status on this? I recently came across this issue when trying to wrap a protobuf generated struct inside another struct that contained other non protobuf generate structs. Now I can't simply use json.Unmarshal/Marshal because the protobuf related struct requires special handling related to the enum type fields. I'd be super happy to have these protobuf structs implement the MarshalJSON and UnmarshalJSON functions with some sane defaults.
Please let me know if this is ready to be worked on.
@jbgrunewald , use https://github.com/mitchellh/protoc-gen-go-json
Thanks, this looks very promising.
James Grunewald
On Dec 4, 2020, at 8:25 PM, Tamal Saha [email protected] wrote:
@jbgrunewald , use https://github.com/mitchellh/protoc-gen-go-json
— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub, or unsubscribe.