ts-proto
google.protobuf.Any to JSON special conversions
According to the official docs for Any the generated JSON should have this extra field @type and the value directly embedded. What I see in the generated code from ts-proto is something like:
export const Any = {
// ...
fromJSON(object: any): Any {
return {
typeUrl: isSet(object.typeUrl) ? String(object.typeUrl) : "",
value: isSet(object.value)
? bytesFromBase64(object.value)
: new Uint8Array(),
};
},
toJSON(message: Any): unknown {
const obj: any = {};
message.typeUrl !== undefined && (obj.typeUrl = message.typeUrl);
message.value !== undefined &&
(obj.value = base64FromBytes(
message.value !== undefined ? message.value : new Uint8Array()
));
return obj;
},
// ...
};
It would be nice if the Any type could be supported in ts-proto like in C++, e.g. with PackFrom/UnpackTo methods.
BTW, I am just starting with ts. So, if there are ways to circumvent this, please point me in the right direction. Thanks in advance.
Hey @mvukov yeah, you're right we don't have any special handling of Any for JSON.
I think we've got a little bit of the infra in place via the outputTypeRegistry output, which adds both message.$type as well as a map of "type name --> message object", both of which I assume would be necessary to parse incoming Any-containing JSON back into the right messages...
OK, will take a look at that as well. Thanks for helping out. Anyways, it would be nice to have this working out of the box eventually :).
If I understand correctly, this is what we want to support?
message MyMessage {
google.protobuf.Any payload = 1;
}
const jsonData = {
"payload": {
"@type": "http://.....",
"foo": "bar"
}
};
const myMessage = MyMessage.fromJSON(jsonData);
MyMessage.toJSON(myMessage) // equals jsonData
But what's not clear to me: what should myMessage.payload contain?
- Uint8Array?
  - Not sure how we could convert JSON to bytes, as the logic for doing that is generated at compile time, and we didn't compile that type yet. Unless we can use the type registry for that, and only allow pre-compiled types? I'm not familiar with this area of Protobuf.
- JSON data? {"@type": "...", "foo": "bar"}
  - This would be easy to implement for JSON ⇄ JSON scenarios, but then MyMessage.encode would again need to know how to convert the JSON to bytes, the same problem as with the first option. Unless we decide not to support this (for now).
Are there any other requirements?
@boukeversteegh ah okay, so I was looking at the protobuf Any examples in Java:
* Foo foo = ...;
* Any any = Any.pack(foo);
* ...
* if (any.is(Foo.class)) {
* foo = any.unpack(Foo.class);
* }
And I think I get it; originally I was assuming that, when using Any, a FooMessage that has a payload: Any would immediately "know" what that payload was after deserialization, like in an OO way you could do:
const foo = FooMessage.decode(bytes)
if (foo.payload instanceof BarMessage) {
...
}
Which, right, would necessitate FooMessage.decode knowing how to dynamically access BarMessage.decode, and hence for us only work with the type registry in the output.
That said, looking at the Java example, it seems that Any in the protobuf ecosystem isn't actually that sophisticated, b/c the user needs to "bring your own polymorphism" (BYOP :-)) with these hand-coded any.is checks; for us I think it'd look like:
- FooMessage.payload actually is/stays an Any that is { typeUrl: string, value: Uint8Array } (basically as we would generate Any today)
  - I.e. after doing FooMessage.decode(bytes), foo.payload.typeUrl is a string and foo.payload.value is a Uint8Array
- The user would do BYOP:
if (Any.is(foo.payload, BarMessage)) {
const bar = Any.unpack(foo.payload, BarMessage)
// really ^ is the same as:
// const bar = BarMessage.decode(foo.payload.value);
// ...so maybe we don't need an Any.unpack
}
(Note that the Java example uses instance methods on any, like any.unpack / any.pack; I'm using static methods on Any b/c ts-proto's messages today are just data / don't have any methods... although, we could treat Any as a value/wrapper type and actually turn it into an instance with methods on it, similar to a Date or what not...)
- We would output a typeUrl in each message's const (without requiring a full-blown type registry map):
export const BarMessage = {
typeUrl: "http://whatever this is",
};
Such that Any.is(any: Any, messageType: { typeUrl: string }) would essentially return any.typeUrl === messageType.typeUrl
Given we do ^, I think that would make bytes-based FooMessage.encode / FooMessage.decode work okay w/o a type registry, and just the addition of a typeUrl const in the output.
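A minimal sketch of that bytes-only BYOP idea, assuming each generated message const exposed a typeUrl next to its encode/decode. The names MessageType and AnyHelpers and the example type URL are hypothetical, not ts-proto output, and the stand-in BarMessage serializes as UTF-8 JSON rather than the protobuf wire format purely to keep the example self-contained:

```typescript
// Shape of google.protobuf.Any as discussed above.
interface Any {
  typeUrl: string;
  value: Uint8Array;
}

// Hypothetical: what each generated message const would need to expose.
interface MessageType<T> {
  typeUrl: string;
  encode(message: T): Uint8Array;
  decode(bytes: Uint8Array): T;
}

// Static helpers, since ts-proto messages are plain data without methods.
const AnyHelpers = {
  is<T>(any: Any, type: MessageType<T>): boolean {
    return any.typeUrl === type.typeUrl;
  },
  pack<T>(message: T, type: MessageType<T>): Any {
    return { typeUrl: type.typeUrl, value: type.encode(message) };
  },
  unpack<T>(any: Any, type: MessageType<T>): T {
    if (any.typeUrl !== type.typeUrl) {
      throw new Error(`expected ${type.typeUrl}, got ${any.typeUrl}`);
    }
    return type.decode(any.value);
  },
};

// Stand-in message type (assumption: real generated code would use the
// protobuf wire format instead of UTF-8 JSON).
interface BarMessage {
  firstName: string;
}
const BarMessage: MessageType<BarMessage> = {
  typeUrl: "type.googleapis.com/example.BarMessage",
  encode: (m) => new TextEncoder().encode(JSON.stringify(m)),
  decode: (b) => JSON.parse(new TextDecoder().decode(b)) as BarMessage,
};

const packed = AnyHelpers.pack({ firstName: "bob" }, BarMessage);
if (AnyHelpers.is(packed, BarMessage)) {
  const bar = AnyHelpers.unpack(packed, BarMessage);
  console.log(bar.firstName); // prints "bob"
}
```

Nothing here needs a registry: the caller names the candidate type, and the comparison is a plain string equality on typeUrl.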
...that said, what I don't understand yet is how JSON based deserialization would work, as you'd already mentioned; i.e. using the same "the user hand-codes their polymorphism" / BYOP:
const jsonData = {
payload: { typeUrl: "...", firstName: "bob" }
};
const foo = FooMessage.fromJSON(jsonData);
// this we can do b/c payload.typeUrl exists...
if (Any.is(foo.payload, BarMessage)) {
// this we cannot do, b/c value isn't a byte[], it's a bunch of key/value pairs...
const bar = BarMessage.decode(foo.payload.value);
}
I wonder how the Java/C++ bindings solve this, i.e. where do they put these json key/value pairs between the time the FooMessage.fromJSON is called, and the BarMessage.fromMessage is called?
I suppose we could have fromJSON create an Any that was a JSON-special version, in that the payload.typeUrl is still the string, but payload.value is not really a byte[], it's the actual JSON object literal, which could only be used if you did BarMessage.fromJSON(foo.payload).
(Which would kind of make sense: just like BYOP means you have to hand-code "if .is(BarMessage)", you also have to hand-code "I 'just know' this came from JSON so use fromJSON".)
Such that if we had an Any interface, it would look like:
interface Any {
typeUrl: string;
value: Uint8Array | object;
}
I.e. value is a Uint8Array if your Any came from a FooMessage.decode / bytes world, but it'd be an object if it came from a FooMessage.fromJSON / json world, and you'd have to "just know" to use either BarMessage.decode or BarMessage.fromJSON as appropriate.
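One way the "just know" step could be softened is a runtime check on value, sketched below. The MessageType interface, the unpack helper, and the toy BarMessage are hypothetical and only illustrate the union type; the instanceof test has to come first because a Uint8Array is itself an object:

```typescript
interface Any {
  typeUrl: string;
  value: Uint8Array | object;
}

// Hypothetical per-message surface; decode/fromJSON mirror the generated names.
interface MessageType<T> {
  typeUrl: string;
  decode(bytes: Uint8Array): T;
  fromJSON(object: any): T;
}

// Picks decode or fromJSON depending on which "world" the Any came from.
function unpack<T>(any: Any, type: MessageType<T>): T | undefined {
  if (any.typeUrl !== type.typeUrl) return undefined;
  return any.value instanceof Uint8Array
    ? type.decode(any.value)
    : type.fromJSON(any.value);
}

// Toy stand-in: the "wire format" here is just the UTF-8 first name.
interface BarMessage {
  firstName: string;
}
const BarMessage: MessageType<BarMessage> = {
  typeUrl: "type.googleapis.com/example.BarMessage",
  decode: (b) => ({ firstName: new TextDecoder().decode(b) }),
  fromJSON: (o) => ({ firstName: String(o.firstName ?? "") }),
};

const fromBytes: Any = {
  typeUrl: BarMessage.typeUrl,
  value: new TextEncoder().encode("bob"),
};
const fromJson: Any = {
  typeUrl: BarMessage.typeUrl,
  value: { firstName: "bob" },
};

console.log(unpack(fromBytes, BarMessage)?.firstName); // prints "bob"
console.log(unpack(fromJson, BarMessage)?.firstName); // prints "bob"
```

This only helps when reading a payload; it does not address re-encoding an unconverted payload in the other format.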
Granted, it does make hopping between formats kind of odd / impossible, i.e. if you do:
const foo = FooMessage.fromJSON(jsonData);
// how does it know the right thing to do?
const bytes = FooMessage.encode(foo);
// you'd probably first have to do
const foo = FooMessage.fromJSON(jsonData);
// figure out what foo.payload is and then...
foo.payload = BarMessage.fromJSON(foo.payload);
// now we can encode
const bytes = FooMessage.encode(foo);
Given ts-proto's goal is "idiomatic JS/TS", I wonder if maybe for Any support we should just assume a type registry so that we can do the "OO" approach of FooMessage.fromJSON(jsonData) immediately knows how to find BarMessage.fromJSON and so foo.payload is literally a BarMessage.
I think that is what I'd personally want, to have the most pleasant ergonomics when working with Any data...
It does assume that, at compile time, we must know all types that may go through payload, i.e. we would not be able to support like a "router" scenario where a pre-built daemon accepts messages with unknown-at-time-of-build Anys and is still able to serde them, i.e. while just doing "dumb proxying" of the messages...
I suppose we could combine both approaches, and foo.payload would be one of three values:
- If you used a type registry and typeUrl was in it, foo.payload would immediately be the BarMessage type. You can use FooMessage.encode(foo) and FooMessage.toJSON(foo) w/o any issues.
- If typeUrl was not in the type registry (or type registry was disabled), and you used FooMessage.decode, then payload would be a Uint8Array, and you'd have to manually convert it to the right type (BYOP). Trying to re-serialize foo as JSON w/o doing that would fail at runtime (but you could re-serialize as bytes and we'd drop it on the wire as-is).
- If typeUrl was not in the type registry (or type registry was disabled), and you used FooMessage.fromJSON, then payload would be an object, and you'd have to manually convert it to the right type (BYOP). Trying to re-serialize foo as bytes w/o doing that would fail at runtime (but you could re-serialize as JSON and we'd drop it on the wire as-is).
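The JSON side of the first and third cases could look roughly like the sketch below. The registry map, ResolvedAny, anyFromJSON, and anyToJSON are hypothetical names, not what outputTypeRegistry emits, and the sketch uses the canonical "@type" key. It also ignores the special "value" field that the proto3 JSON mapping uses for well-known types embedded in an Any:

```typescript
// Hypothetical registry entry; the real type registry output may differ.
interface MessageType<T> {
  typeUrl: string;
  fromJSON(object: any): T;
  toJSON(message: T): unknown;
}

const registry = new Map<string, MessageType<any>>();

// Either a fully resolved message or the raw JSON fields kept as-is.
type ResolvedAny =
  | { kind: "message"; typeUrl: string; message: unknown }
  | { kind: "json"; typeUrl: string; value: object };

function anyFromJSON(json: Record<string, unknown>): ResolvedAny {
  const { "@type": typeUrl, ...fields } = json;
  if (typeof typeUrl !== "string") {
    throw new Error('Any JSON requires a string "@type"');
  }
  const type = registry.get(typeUrl);
  return type
    ? { kind: "message", typeUrl, message: type.fromJSON(fields) }
    : { kind: "json", typeUrl, value: fields };
}

function anyToJSON(any: ResolvedAny): unknown {
  if (any.kind === "json") {
    // Unknown type: pass the fields through untouched.
    return { "@type": any.typeUrl, ...any.value };
  }
  const type = registry.get(any.typeUrl);
  if (!type) throw new Error(`unregistered type ${any.typeUrl}`);
  return { "@type": any.typeUrl, ...(type.toJSON(any.message) as object) };
}

// Stand-in message type registered for the example.
interface BarMessage {
  firstName: string;
}
const BarMessage: MessageType<BarMessage> = {
  typeUrl: "type.googleapis.com/example.BarMessage",
  fromJSON: (o) => ({ firstName: String(o.firstName ?? "") }),
  toJSON: (m) => ({ firstName: m.firstName }),
};
registry.set(BarMessage.typeUrl, BarMessage);

const known = anyFromJSON({ "@type": BarMessage.typeUrl, firstName: "bob" });
const unknownType = anyFromJSON({ "@type": "type.googleapis.com/example.Other", foo: "bar" });
console.log(known.kind, unknownType.kind); // prints "message json"
```

The tagged union makes the "which of the three values is this?" question explicit at the type level, so a bytes encoder could refuse the "json" kind at runtime as described in the list.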
I dunno...I think ^ makes sense, but it sounds like a lot of work. :-D @boukeversteegh wdyt? I think where I ended up is probably the "most fancy aka expensive" but also "most ergonomic" solution...
@mvukov let us know if we're making this more complicated than it should be :-)
Great analysis @stephenh! The odd-looking sequence where things are only parsed partially, and filling in the blanks manually is something I wouldn't imagine users would be happy with.
Of course it's better than not being able to work with Any at all though. However, explaining it in the docs will also be difficult.
I would personally expect that FooMessage.payload to contain an Any object, with a separate instruction to decode it (for performance reasons). But I realize this is impossible, because sometimes Any is provided as Bytes (decode), and sometimes as JSON (fromJSON), so in one of those cases conversion needs to happen, in order to normalize it.
Unless, we would represent Any with two fields, and make it 'lazy' in both scenarios.
class Any {
private _jsonEncoded: any;
private _bytes: Uint8Array;
private _type: string;
toJSON() { return this._jsonEncoded; } // OR decode the bytes and then call <type>.toJSON()
}
I guess it's fine to make Any a bit smarter than the other messages (i.e. break the 'interface only' principle).
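A slightly fuller sketch of that lazy idea, under the assumption that the caller supplies the concrete message type when converting. LazyAny, MessageType, asJSON, and asBytes are hypothetical names. The methods are deliberately not called toJSON, because JSON.stringify would invoke a toJSON method with a property key as its argument. The stand-in BarMessage again uses UTF-8 JSON instead of the protobuf wire format:

```typescript
// Hypothetical per-message surface, mirroring the generated method names.
interface MessageType<T> {
  typeUrl: string;
  encode(message: T): Uint8Array;
  decode(bytes: Uint8Array): T;
  fromJSON(object: any): T;
  toJSON(message: T): unknown;
}

class LazyAny {
  readonly typeUrl: string;
  private bytes?: Uint8Array;
  private json?: unknown;

  private constructor(typeUrl: string, bytes?: Uint8Array, json?: unknown) {
    this.typeUrl = typeUrl;
    this.bytes = bytes;
    this.json = json;
  }

  static fromBytes(typeUrl: string, bytes: Uint8Array): LazyAny {
    return new LazyAny(typeUrl, bytes, undefined);
  }

  static fromJSON(typeUrl: string, json: unknown): LazyAny {
    return new LazyAny(typeUrl, undefined, json);
  }

  private check<T>(type: MessageType<T>): void {
    if (type.typeUrl !== this.typeUrl) {
      throw new Error(`expected ${this.typeUrl}, got ${type.typeUrl}`);
    }
  }

  // Converts from bytes only on first use, then caches the result.
  asJSON<T>(type: MessageType<T>): unknown {
    this.check(type);
    if (this.json === undefined) {
      this.json = type.toJSON(type.decode(this.bytes!));
    }
    return this.json;
  }

  // Converts from JSON only on first use, then caches the result.
  asBytes<T>(type: MessageType<T>): Uint8Array {
    this.check(type);
    if (this.bytes === undefined) {
      this.bytes = type.encode(type.fromJSON(this.json));
    }
    return this.bytes;
  }
}

// Stand-in message type for the example.
interface BarMessage {
  firstName: string;
}
const BarMessage: MessageType<BarMessage> = {
  typeUrl: "type.googleapis.com/example.BarMessage",
  encode: (m) => new TextEncoder().encode(JSON.stringify(m)),
  decode: (b) => JSON.parse(new TextDecoder().decode(b)) as BarMessage,
  fromJSON: (o) => ({ firstName: String(o.firstName ?? "") }),
  toJSON: (m) => ({ firstName: m.firstName }),
};

const lazy = LazyAny.fromJSON(BarMessage.typeUrl, { firstName: "bob" });
const wire = lazy.asBytes(BarMessage); // conversion happens here, once
const back = LazyAny.fromBytes(BarMessage.typeUrl, wire).asJSON(BarMessage);
console.log(JSON.stringify(back)); // prints {"firstName":"bob"}
```

A payload that is only proxied never pays for a conversion, but hopping formats still requires knowing the concrete type, either from the caller as here or from a registry.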
Is it actually possible to implement Any.is using <message>.$type? Because I think that will just store the typename, and not a URL, right? Or are those strings the equivalent?
I agree with your last point that we need feedback on what is really needed. It sounds indeed like a lot of work and especially without any experience of working with Any (for me at least), I wouldn't know how to evaluate the design choices.
PS: I have also never worked with the type registry, but if it makes things simpler (narrower scope) to require it for Any fields to work at all, I would say let's do it. We can always expand.
Any solutions?
As someone who uses this similar pattern in ts-proto, I've forgone "Any", and use "Value" + type registry hacks instead.
The primary reason is that Any feels only semi-supported in the binary encodings of proto3, at least according to the documentation. Value has clear semantics and encodings, and is a little less magic. https://developers.google.com/protocol-buffers/docs/proto3#any -- "Currently the runtime libraries for working with Any types are under development."
I suppose sooner or later, we'd port to Any if the support was good!
I urgently needed Any parsing on the JS side, so the only workaround for me was to edit the generated files manually: I replaced : Any with : any in the interfaces and in the from/to JSON functions. That way it at least works, with some tricks, and in some use cases even without them. Could you please add this to the well-known types, with an option like --ts_proto_opt=any=any or any=native, so the generator emits the any type instead of Any? TypeScript has native support for this scenario. (This would be a stopgap while thinking of another solution.) Thanks!
Please consider "@type" over typeUrl to remain with the canonical definition of Any.
Any update on this?