ts-proto
ts-proto copied to clipboard
Support proto2 optional fields
Now that we have support for proto3 optional (#73, yay!), should we also add support for proto2's version, LABEL_OPTIONAL
? Right now something like optional uint32 index = 42
translates into index: number
but IMHO it should be index?: number
.
I'd be happy to write the code for this.
@philikon sure, that sounds like a good idea. Thanks!
I am using a patch of ts-proto
in Bazel + NestJS to achieve this and opened PR #162 with some additional changes on top of what my patch had, to modify fromPartial
and fromJson
and I think it should be ok
I looked a little bit at this, and the proto2 docs on optional
:
https://developers.google.com/protocol-buffers/docs/proto#optional
When a message is parsed, if it does not contain an optional element, the corresponding field in the parsed object is set to the default value for that field.
So, if we have an optional foo
proto, should it become a foo: number
b/c when decoded off the wire, it will (?) always be set to 0
if not present, or should it be foo?: number
and left as undefined?
Fwiw my guess is that users mostly expect foo?: number
but the docs insinuate otherwise.
Another option would be to have it be foo: number
but get the default value from a prototype, so that you could still check for presence via message.hasOwnProperty('foo')
. That is probably where I would lean.
Optionals are not deserialized into default values. The reader should check if there are any bytes at the field position and if not it gets turned into the language representation of undefined. For ex in JS that's undefined
, in Scala it's an Optional of None
, etc
Optionals are not deserialized into default values.
I mean, I agree intuitively, that's just not what the proto docs I linked to above assert...
I.e. you say "turned into the language representation of undefined
" but they say "turned into the field's default value".
Maybe the nuance is that in Scala/class-based messages, the internal field can be undefined
but the generated getter function could still return the default value?
Dunno. I mean, I generally agree with you, and maybe we just do that anyway, even if its not technically what the docs assert (or I'm misunderstanding them).
I mean, I agree intuitively, that's just not what the proto docs I linked to above assert...
I believe it is. The docs say
well-formed message may or may not contain an optional element. When a message is parsed, if it does not contain an optional element, the corresponding field in the parsed object is set to the default value for that field.
the key words to me being that the default is set when the message does not contain an optional element
for example, i am pretty sure when using just the protoc compiler with js_out
to output dts+js
optional fields are also not deserialised into default values when the reader is a proto2 reader. It has been a few months since i ran the protoc with js_out
but i am pretty sure thats what happens. I've also been using this in production this way against java/scala/python gRPC systems and they do not set default values when reading optional fields off the wire that have not been set.
...so you're saying that the message would actually contain the optional element, i.e. on the wire somewhere is the data that "fieldA
is not set"?
I.e. fieldA
is actually in the protobuf bytes, it's just in there as "unset".
Fwiw that's not my understanding of how the message serialization works, i.e. a primitive is either a) included and has some set value, or b) is not included and so has no value.
Can you point to docs on how to serialize/deserialize "this element/field (a primitive) is included but not set"?
...so you're saying that the message would actually contain the optional element, i.e. on the wire somewhere is the data that "fieldA is not set"? I.e. fieldA is actually in the protobuf bytes, it's just in there as "unset".
No, my understanding of it is that the readers know what the message contract is, they know what field positions to look for for bytes and whether they are optional or not, therefore the readers know that when there is nothing in a particular field position, and its an optional field, to set it to the language representation of unset, be that undefined
, None
or whatever. I am basing this on inspecting various generated readers in various languages generated by protoc. This is different in proto3, but they have a new experimental optional flag for proto3 that I believe yields the same functionality.
This is from the proto2 docs I linked to above:
"When a message is parsed, if it does not contain an optional element, the corresponding field in the parsed object is set to the default value for that field."
Which starts off sounding a lot like your "when there is nothing in a particular field position ... " except then you change to "...set it to the language representation of unset", which I can't find any description of that behavior on that proto2 page.
Fwiw the java tutorial has:
// protoc
optional string email = 3;
// java
public boolean hasEmail();
public String getEmail();
Which matches my recollection of proto2 from when I used it before: the getEmail()
getter when unset still returns ""
(which is what that proto2 doc snippet I referred to is talking about) and you have to specifically use the hasEmail()
"hazzer" to check for "was this ""
really sent or am I just getting a default".
I.e my point is generally that class-based languages have two ways (methods) of accessing email
: one getter that always returns the default value for the primitive type and one "hazzer" (their term) that says "was this actually present or are you being fed a default".
It's hard for ts-proto to provide two of these capabilities without something like a prototype and hasOwnProperty
. Which we can do (ts-proto originally worked that way even for non-optional values) but it's, dunno, just slightly different b/c then Object.keys
doesn't have the email
field in it (in this example) b/c it's set on the prototype. Object.keys
and entries
breaking is why I moved ts-proto off of this approach.
I just checked the Java and C++ tutorials on that page, and C++ also looks like it does the "getter is an empty string + separate hazzer method".
Can you link to generated code for proto2 optional
in other dynamic languages? My guess is that they will all have getters and hazzers.
Oof 🤦 so i looked into this today and it turns out we had some patches to generators where I work that made it function as i described, but you are correct in your description of how the protobuf spec is intended to work.
Lol, well, that explains why we were talking in circles. :-D
Fwiw I get why you're are doing it that way, I personally think it makes more sense...
But not sure if that means ts-proto should do it.
I suppose it could end up being another option flag.
Wanted to try and convince you to either make it optional indeed for proto2 optionals or at least provide a flag which does this. Indeed proto2 expects deserializers to return the default value when accessed directly (i.e. via getter) and add a hazzer function which checks whether the value was actually set. This is because proto2 has custom default values and these are implemented by the deserializer (which is why it can't return "undefined" if a value doesn't exist).
Because your generator creates plain ts interfaces and can not provide a hazzer I think that the correct thing to do it make those field ts options (i.e. ?
) so the reader could actually differentiate between a set and unset value and it will be up to the reader to use the default value when the property isn't set.
@eyalpost I agree the lack of hazzers is unfortunate; I've historically gone back/forth on "would clients prefer being able to detect not-present" or "would clients prefer not having to 'or' their own default value", and kind seems like at the end-of-the-day it's going to be on a case-by-case basis.
That said, I've also felt this pain in ts-proto itself, as the protobuf descriptors themselves have an optional int
field oneof_index
that is a) a proto2 file, b) an optional int field, and c) the values are 0
-based, so if we read 0
with ts-proto's current approach, we just cannot tell "...is your index 0? or did you mean unset?".
If you'd like to submit a PR that changes the behavior out-right to index?: number
, as per @philikon 's original request, we could merge it and see how many users request the old behavior, and then implement another flag. :-) :shrug:
@ssilve1989 can you share example how did you configure ts-proto with Bazel? Thanks
@deser We wrote an aspect based on the one found here: https://github.com/bazelbuild/rules_nodejs/blob/4.x/packages/labs/grpc_web/ts_proto_library.bzl
although aspect-based proto compilation does not look to be the recommended way of doing this anymore so we will be switching over to use an implementation by stackb/rules_proto
when it is implemented
Thanks :)
It would be great to have the same behavior as optionals in proto3, i.e. { scalarOptional: number | undefined }
@val-o I'm traveling so don't have time to fully remember/study the issue, but iirc / kinda guessing that @Vilsol 's PR from last year fixes this:
https://github.com/stephenh/ts-proto/pull/484/files#diff-ad74198e9093dc9e981de421c0e333bb02e31091258228b27ec0dfe1932815a0R279
And we just forgot to close this issue out... @Vilsol does that sound right to you?
As a data point, I stumbled on this recently while using the descriptor generated by this library with another proto library: https://github.com/andrewhickman/prost-reflect/issues/49
Fwiw, per an earlier update, I'm 95% sure this has been fixed b/c ts-protos own/internal ts-proto-descriptors
library that we use for reading messages from the protoc compiler handles oneofIndex
being an proto2 optional field via a hazzer check.
@stephenh in my case I'm not reading the proto descriptor using this library, but I'm serializing the descriptor using this library and then consuming it from another library.
Hm, @slinkydeveloper are you using the usePrototypeForDefaults
option? The oneof_index
in your bug report is exactly what we had to handle too, although now I'm wondering if we handled the reading correctly, but not the writing, i.e. our encode
methods that turn messages to bytes may always be outputting oneof_index=0
even if the 0
is the default from the prototype...
Lmk if you're using usePrototypeForDefaults
, and if so, yeah, we probably need a new issue to track writes not serializing the default.