protobuf-ts
protobuf-ts copied to clipboard
Symbol missing from generated types
Codegen will create typescript definitions that do not include the MESSAGE_TYPE symbol, even though it will be present in all response messages.
Following the example in the manual for this definition:
message Person {
string name = 1;
uint64 id = 2;
int32 years = 3 [json_name = "baz"];
optional bytes data = 5;
}
I would expect that the typescript interface is something like:
import { MessageType, MESSAGE_TYPE } from '@protobuf-ts/runtime';
export interface Person {
/**
* @generated from protobuf field: string name = 1;
*/
name: string;
/**
* @generated from protobuf field: uint64 id = 2;
*/
id: bigint;
/**
* @generated from protobuf field: int32 years = 3 [json_name = "baz"];
*/
years: number;
/**
* maybe a jpeg?
*
* @generated from protobuf field: optional bytes data = 5;
*/
data?: Uint8Array;
// This what's missing
[MESSAGE_TYPE]: MessageType<Person>
}
The [MESSAGE_TYPE]
field is a non-enumerable property that is added to the object as a helpful "branding" that can be used for reflection purposes, but is not actually a required to make a valid (in this case) Person
message object. The idea to add this to the generated interface was also considered. The generated TS interface is defining the required shape of an object, but the object can have other fields and still pass type checking. For example this following would have no TS errors:
const personWithExtra = {
name: 'Foo',
id: 123n,
years: 123,
extra: 'bar' // not in interface
};
function onlyPerson(input: Person) {}
onlyPerson(personWithExtra); // no errors
There is a helpful type-guard containsMessageType
that you can import from the @protobuf-ts/runtime
which will adjust the type of object (in the truthy case) to include the non-enumerable MESSAGE_TYPE
property such that you can safely access it.
That function and the symbol should probably be documented in the manual as there doesn't seem to be any mention of it. Perhaps because it is still considered experimental: https://github.com/timostamm/protobuf-ts/blob/55a7d5588afad4a98e57dc35789d97650d96da43/packages/runtime/src/message-type-contract.ts#L6-L12
Thanks for the summary, @jcready. We have two symbol properties, the other one is "protobuf-ts/unknown":
https://github.com/timostamm/protobuf-ts/blob/44b0fe4b7fe202c4d8a51f785c60ce71ca717622/packages/runtime/src/binary-format-contract.ts#L79
It is used to store unknown fields during deserialization in the message instance. That way, old code can work on a message with added fields, and not lose any data during serialization. It is a core feature of Protobuf. If an implementation doesn't support that, it's not conformant.
Because of that, it seemed pretty harmless to add the experimental MESSAGE_TYPE
as well. It is added for each message, though, while "protobuf-ts/unknown" is only added if an unknown field is encountered.
@cortopy, how did you run into the property? Was it through unit tests?
I found about the MESSAGE_TYPE
symbol when fixing a bug in my frontend application. I use redux and some of the messages that come from server responses are directly stored in the redux store without further modification. I thought the message would include only the properties in the type and so I was surprised that when I was diffing the value stored in redux with one the app generates there were some discordance.
Even though Symbols are not enumerable, I would still prefer as much type correctness as possible. This was one of those nasty bugs that are difficult to pin down and so I had to write some logic to remove the symbol before putting in store. This wasn't enough to fix the bug and it may be the case that I didn't need the logic, but having "hidden" properties, whether enumerable or not, is not something I was expecting
@timostamm I wonder if we could avoid the whole mess of extra properties on the objects by using a WeakMap
to store the message type (and possibly the unknown fields). Something to consider for v3.
// runtime/src/message-type-contract.ts
const MessageTypeMap = new WeakMap<object, IMessageType<any>>();
export const getMessageType = <T extends object>(ref: T): IMessageType<T> | undefined => MessageTypeMap.get(ref);
// runtime/src/binary-format-contract.ts
const UnknownFieldMap = new WeakMap<object, UnknownField[]>();
export const getUnknownFields = (ref: object): UnknownField[] | undefined => UnknownFieldMap.get(ref);
// Usage
import { getMessageType, getUnknownFields, IMessageType, UnknownField } from '@protobuf-ts/runtime';
const MessageType: IMessageType<typeof someReference> | undefined = getMessageType(someReference);
if (MessageType) {
console.log(MessageType.toJson(someReference));
}
const UnknownFields: UnknownField[] | undefined = getUnknownFields(someReference);
if (UnknownFields) {
console.log(UnknownFields.map((uf) => uf.no));
}
Ah, WeakMap is definitely a nice idea. I've been using it back in the days in ActionScript and didn't realize JavaScript implementations finally caught up :)
Can't think of any drawbacks. We export both symbols, so it is a breaking change, but it seems worth it for a new major.
This would be useful for users of SvelteKit as well. The default way of passing data from client/server around in that framework gets tripped up on these symbols and requires you clone new objects without the symbols to work.
FWIW my suggested solution using WeakMap
does have some unintended consequences with regard to unknown fields. Currently the behavior of unknown fields is to set an ENUMERABLE Symbol
(unlike the message type symbol which is not enumerable) on the message object.
If we wish to replace the usage of Symbol
s with WeakMap
s then what do we do for unknown fields now that spreading the message object will no longer preserve the unknown fields? The manual specifically says that the way to drop unknown fields is to use MessageType.clone()
. But since spreading will no longer preserve the unknown fields I wonder if for the next major release if we should do the opposite (e.g. MessageType.clone/create/mergePartial()
will preserve unknown fields, while spreading will not).
Though that then begs the question: if one were to MessageType.mergePartial()
and both the target and source had referentially different unknown fields, do we concat?
I also want to point out that stamping the message with the Symbol
is rather expensive. When benchmarking binary reads (on nodejs v20) I'm seeing an almost 2.5x improvement if I comment out this bit from the create()
method:
globalThis.Object.defineProperty(message, MESSAGE_TYPE, { enumerable: false, value: this });
With stamp: 550,101 ops/sec ±1.24% (95 runs sampled)
Without stamp: 1,323,473 ops/sec ±0.87% (92 runs sampled)
To be fair, using the WeakMap
approach is even slower:
Using WeakMap: 311,061 ops/sec ±4.00% (63 runs sampled)
So far I've only found one way to maintain the current symbol stamp (and all its enumerability behaviors) while avoiding the slow down: define an EMPTY class for creating message instances and then define the symbol property on that class's prototype.
Inside the base MessageType
class constructor:
this.MessageInstance = class MessageInstance{};
globalThis.Object.defineProperty(this.MessageInstance.prototype, MESSAGE_TYPE, { value: this });
Then the create()
method for each generated class would replace
const message = { repeatedField: [] };
globalThis.Object.defineProperty(message, MESSAGE_TYPE, { enumerable: false, value: this });
with
const message = new this.MessageInstance();
message.repeatedField = [];
// alternatively
const message = Object.assign(new this.MessageInstance(), { repeatedField: [] });
Even though the defineProperty
bit is expensive we're now only performing it once per message "class" instead of once per message "instance".
I still think the WeakMap
approach could work and be fast so long as we used that MessageInstance
class approach like shown above (just without the symbol). The only difference is that instead of adding every message "instance" (this
) to the WeakMap
, we instead add the this.__proto__
to it.
Previous proposal:
// somewhere in @protobuf-ts/runtime
const MessageTypeMap = new WeakMap<object, IMessageType<any>>();
export const setMessageType = <T extends object>(ref: T, MessageType: IMessageType<T>): void => {
MessageTypeMap.set(ref, MessageType);
};
export const getMessageType = <T extends object>(ref: T): IMessageType<T> | undefined =>
MessageTypeMap.get(ref);
// generated file, Foo$Type extends MessageType<Foo>'s create() method:
const message = {};
setMessageType(message, this);
New proposal:
// somewhere in @protobuf-ts/runtime
const MessageTypeMap = new WeakMap<object, IMessageType<any>>();
export const setMessageType = <T extends object>(ref: T, MessageType: IMessageType<T>): void => {
MessageTypeMap.set(ref.__proto__, MessageType);
};
export const getMessageType = <T extends object>(ref: T): IMessageType<T> | undefined =>
MessageTypeMap.get(ref.__proto__);
// generated file, Foo$Type extends MessageType<Foo>'s create() method:
const message = new this.MessageInstance();
setMessageType(message, this);