protobuf-ts icon indicating copy to clipboard operation
protobuf-ts copied to clipboard

Ideas

Open timostamm opened this issue 5 years ago • 8 comments

This is a collection of ideas for further development:

  • Generate clients for @grpc/grpc-js (see #57)

  • Provide a "backend" for grpc-web, similar to the grpc-backend

  • Support Enum / Enum value options (see #37)

  • ~Convenience functions to read service and message options (see #36)~

  • ~Support message options (see #35)~

  • support validate.proto

    • see https://github.com/envoyproxy/protoc-gen-validate (already has a test harness)
    • function validate<T>(message: T, IMessageType<T> type): string | undefined
    • make available in @protobuf-ts/runtime or @protobuf-ts/validate
  • run test-generated specs in web browser too

  • benchmark code-size for grpc-web and twirp

  • globalThis on nodejs < 12.0.0: use polyfill or show better error message
    https://stackoverflow.com/questions/29349684/how-can-i-specify-the-required-node-js-version-in-package-json

  • rename oneof discriminator from "oneofKind" to "oneof"

  • grpc web interop-tests:

    • https://github.com/grpc/grpc-web/tree/master/test/interop (docker; slow setup; difficult testing)
    • https://github.com/grpc/grpc-web/blob/master/doc/interop-test-descriptions.md
    • https://github.com/grpc/grpc-web/blob/master/docker-compose.yml
    • https://github.com/grpc/grpc-web/tree/master/net/grpc/gateway/docker
  • field mask utils

    • FieldMask.normalize(mask: FieldMask): FieldMask Converts a FieldMask to its canonical form. In the canonical form of a FieldMask, all field paths are sorted alphabetically and redundant field paths are removed.
    • FieldMask.union(firstMask: FieldMask, ... otherMasks: FieldMask[]): FieldMask Creates a union of two or more FieldMasks.
    • FieldMask.intersection(s: FieldMask, b: FieldMask): FieldMask Calculates the intersection of two FieldMasks.
  • generate IMessageType<T>.equals(a: T, b: T);

  • generate json I/O methods

  • IMessageType<T>.extract(message: T, mask: FieldMask): PartialMessage<T>; Returns a partial message that has only the fields specified in the mask.

  • IMessageType<T>.diff(a: T, b: T): FieldMask; Creates a field mask with all fields that are different between a and b.

  • IMessageType<T>.mergeMessage(target: T, source: T, mask: FieldMask): T; convenience for doing extract() and mergePartial()

  • generate "typeName" discriminator property in interfaces? a symbol?

  • prevent local field name clashes, see book.proto TestPrimitiveFieldsWithSameJsonName

  • proto compiler in typescript

    • ebnf: https://developers.google.com/protocol-buffers/docs/reference/proto3-spec

timostamm avatar Dec 23 '20 17:12 timostamm

For what's it's worth, https://developers.google.com/protocol-buffers/docs/reference/proto3-spec isn't accurate, not even close actually heh.

bufdev avatar Apr 01 '21 16:04 bufdev

@timostamm

generate "typeName" discriminator property in interfaces? a symbol?

I'm trying to convert a legacy protobuf.js-based system to protobuf-ts. The biggest problem that I've encountered is that protobufjs generates "classes"; given a message object, you can access its type name, and this feature is used in the system (ex.: messages can be passed to a logger as a part of a deeply nested structure, and the logger applies redaction rules based on sensitivity options attached to fields in proto).

Do you think it's ok to have something like

export function reflectionCreate(info: MessageInfo): any {
    // ....
    return { ...msg, [Symbol("protoReflection")]: info };
}

to brand the created objects?

You are suggesting adding typeName, but without a central registry, it's hard to get from a name to the reflection information.

odashevskii-plaid avatar Aug 27 '21 02:08 odashevskii-plaid

I've been thinking about this feature for a while, @odashevskii-plaid.

In my experiments, I found that it would be useful to have a symbol property (acting as a type branding) pointing to the IMessageType.

There is a downside with adding this property to the generated interfaces. It will be less obvious to users how to create such an object.

On the upside, having to pass the message as well as its message type - which you need to do most of the time for a function that acts generically on a message - would become unnecessary. And it allows for some dramatic changes in the public API. IMessageType as the public API to work with messages could be replaced by simple functions. For example, instead of having to write Foo.toBinary(foo), you would simply use toBinary(foo). toBinary would look at the type property and retrieve the reflection information (or the speed-optimized internalBinaryWrite).

This would ultimately allow the JSON format to be omitted from the webpack build output, as long as toJson() isn't used in the code, further reducing code size. Same for many other methods of IMessageType.

So I see technical benefits, and also benefits for the developer experience. There are quite a few things left to figure out though, and I am not entirely sure how the downside will manifest.

timostamm avatar Aug 29 '21 15:08 timostamm

Possibly provide a new package for parsing the currently unspecified text protobuf format.

jcready avatar Sep 14 '21 19:09 jcready

Regarding the FieldMask stuff on the IMessageType: wouldn't that require that the runtime (I assume) had access to a pre-generated generated FieldMask MessageType? How would that work?

jcready avatar May 16 '22 20:05 jcready

Probably just a FieldMaskLike interface. It's a pretty simple definition, if you look at the fields. Field masks are a nice idea, but the spec is odd for repeated fields, and using field names instead of field numbers makes it more brittle than necessary.

timostamm avatar May 17 '22 10:05 timostamm

While investigating the FieldMask stuff I noticed that the existing implementations (namely the Java and Python versions) both have "merge options" wherein the caller is able to configure the merge strategy. For example repeated fields on the source can either replace or append-to the target fields.

Another thing I noticed is that both implementations end up relying upon their Message#Merge() once reaching a "leaf" of the FieldMask when that leaf refers to a singular message field. This leads to a few complications assuming we'd like to have matching behaviors with regard to the unit tests.

So if someone like myself wanted to take a stab at implementing some of the FieldMask work you outlined above they'd be left with a choice to do one of the following:

  1. Ignore the existing implementations for FieldMask's merge behaviors and create unit tests which wouldn't match.
  2. Attempt to match the existing implementations' default merge behavior which means zero reliance on reflectionMergePartial (as it will "replace" repeated fields by default instead of "append") which means a lot of nearly identical code.
  3. Add some kind of new "merge options" to reflectionMergePartial which would default to the current reflectionMergePartial behavior, but allow the FieldMask's merge behavior to match the existing implementations' behavior by passing a custom "merge options" object when calling MessageType.mergePartial() on a "leaf".

Maybe something like

export enum MergeRepeated {
    /** 
     * Source will replace target
     * ```ts
     * target = [{ a: 9, b: true }, { a: 8, b: true }]
     * source = [{ a: 1          }                   ]
     * result = [{ a: 1          }                   ]
     * ```
     */
    REPLACE = 0,
    /** 
     * Source will append to target
     * ```ts
     * target = [{ a: 9, b: true  }, { a: 8, b: true }          ]
     * source = [                                       { a: 1 }]
     * result = [{ a: 9, b: true  }, { a: 8, b: true }, { a: 1 }]
     * ```
     */
    APPEND = 1,
    /** 
     * Source will overwrite target by index
     * ```ts
     * target = [{ a: 9, b: true }, { a: 8, b: true }]
     * source = [{ a: 1          }                   ]
     * result = [{ a: 1          }, { a: 8, b: true }]
     * ```
     */
    SHALLOW = 2,
    /** 
     * Source will deeply merge target by index
     * ```ts
     * target = [{ a: 9, b: true }, { a: 8, b: true }]
     * source = [{ a: 1          }                   ]
     * result = [{ a: 1, b: true }, { a: 8, b: true }]
     * ```
     */
    DEEP = 3,
}

export enum MergeMap {
    /** 
     * Source will replace target
     * ```ts
     * target = { foo: { a: 9, b: true }, bar: { a: 8, b: true }                }
     * source = {                                                 baz: { a: 1 } }
     * result = {                                                 baz: { a: 1 } }
     * ```
     */
    REPLACE = 0,
    /** 
     * Source will overwrite target by key
     * ```ts
     * target = { foo: { a: 9, b: true  }, bar: { a: 8, b: true }                }
     * source = { foo: {       b: false },                         baz: { a: 1 } }
     * result = { foo: {       b: false }, bar: { a: 8, b: true }, baz: { a: 1 } }
     * ```
     */
    SHALLOW = 1,
    /** 
     * Source will recursively merge
     * ```ts
     * target = { foo: { a: 9, b: true  }, bar: { a: 8, b: true } }
     * source = { foo: {       b: false },                         baz: { a: 1 } }
     * result = { foo: { a: 9, b: false }, bar: { a: 8, b: true }, baz: { a: 1 } }
     * ```
     */
    DEEP = 2,
}

export interface MergeOptions {
    /**
     * @default MergeMap.SHALLOW
     */
    map?: MergeMap;
    /**
     * If false message fields will be replaced instead of recursively merged (if the source field is set).
     * @default true
     */
    message?: boolean;
    /**
     * @default MergeRepeated.REPLACE
     */
    repeated?: MergeRepeated;
}

jcready avatar Jun 04 '22 14:06 jcready

Some context - from field_mask.proto:

If a repeated field is specified for an update operation, new values will be appended to the existing repeated field in the target resource.

This merge behavior runs pretty deep in protobuf. It is also used in the binary format, meaning that deserializing a message multiple times into the same object will overwrite singular (scalar) fields, but append to repeated fields.

internalBinaryRead() does just that, but we do not have a matching merge() function that operates on two message instances. An implementation of FieldMask operations should definitely be compatible with this behavior. For example, a client might send a delta update for an entity using a field mask, and that only works if the algorithms on server and client match.

In my opinion, this merge behavior is really useful for the binary format, but not ideal for a general-use merge() function. The merge options in the Java and Python implementation seems to support that assessment. I think it's especially true for JavaScript, where we are used to Object.assign(), Lodash's merge(), and many others. That's why the only merge function we offer is MessageType.mergePartial().

Looking at the notes on field masks in the issue description, It looks like MessageType.mergeMessage() relying on extract() and mergePartial() would be a problem, and adding methods to MessageType should probably be avoided so that they don't increase code size for users who don't use field masks.

The two most important operations related to field masks seem to be:

  • Update a target message from a source message + field mask
  • Create a diff between two messages, resulting in a field mask + a message where only the different fields are present

I think just the implementation of those two pieces would be a substantial step forward. They could simply be two standalone functions that use the protobuf merge behavior. A follow-up could add merge options and then we should probably look into sharing code with mergePartial(), but I'm not sure they would be necessary for a start.

timostamm avatar Jun 08 '22 08:06 timostamm