protobuf-ts icon indicating copy to clipboard operation
protobuf-ts copied to clipboard

Update `oneof` representation

Open jcready opened this issue 3 years ago • 18 comments

import { getOneofValue } from '@protobuf-ts/runtime';

interface ISomeMessage {
    foo: string;
}

type TSomeMessage = {
    foo: string;
}

declare var oneofWithInterface: {
    oneofKind: "a";
    a: ISomeMessage;
} | {
    oneofKind: undefined;
};

declare var oneofWithType: {
    oneofKind: "a";
    a: TSomeMessage;
} | {
    oneofKind: undefined;
};

const t = getOneofValue(oneofWithType, 'a'); // TSomeMessage | undefined
const i = getOneofValue(oneofWithInterface, 'a'); // TS error
// Argument of type '{ oneofKind: "a"; a: ISomeMessage; } | { oneofKind: undefined; }' is not assignable to parameter of type 'UnknownOneofGroup'.
//   Type '{ oneofKind: "a"; a: ISomeMessage; }' is not assignable to type 'UnknownOneofGroup'.
//     Property 'a' is incompatible with index signature.
//       Type 'ISomeMessage' is not assignable to type 'string | number | bigint | boolean | Uint8Array | UnknownMessage | undefined'.
//         Type 'ISomeMessage' is not assignable to type 'UnknownMessage'.
//           Index signature is missing in type 'ISomeMessage'.(2345)

The issue seems like some discrepancy between a type and interface and the use of index signatures. As shown above the TSomeMessage type works just fine, but the identical interface causes issues. The issue can be solved in one of two ways:

  1. Change the code generation to output types for Messages. (this seems bad)
  2. Change UnknownMessage to be an interface. (this seems much better)

jcready avatar Aug 15 '21 15:08 jcready

Thanks for the report and investigation! This confirms again that even the smallest feature requires overly diligent testing.

Changed UnknownMessage from type to interface in v2.0.2.alpha-0`.

timostamm avatar Aug 15 '21 17:08 timostamm

I realize this would require a major version bump, but I'm curious if oneofs might be better (or at least easier to operate on in the world of TypeScript) represented as tuples. This would pretty much eliminate the need to make specialized function to get/set/clear the oneofs on a message.

type UnknownOneofGroup = [
    string,
    UnknownScalar | UnknownEnum | UnknownMessage,
] | [];

type ExampleOneof =
  | ['a', string]
  | ['b', number]
  | ['c', boolean]
  | [];

interface ExampleMessage {
  example: ExampleOneof;
}

declare var message: ExampleMessage;

// Get
if (!message.example[0]) {
  throw new Error("No kind specified");
}
const [
  kind, // "a" | "b" | "c"
  value // string | number | boolean
] = message.example;

// Set
message.example = ['b', 1];

// Clear
message.example = [];

A couple of helper constants could be exported for the fun of it.

// oneof.ts
export const KIND = 0;
export const VALUE = 1;
import { KIND, VALUE } from '@protobuf-ts/oneof';

// Get
if (!message.example[KIND]) {
  throw new Error("No kind specified");
}
const value = message.example[VALUE]; // string | number | boolean
const kind = message.example[KIND]; // "a" | "b" | "c"

jcready avatar Aug 15 '21 21:08 jcready

Actually, now I'm curious why the oneofs are structured the way they are currently.

In protobufjs the oneof identifier could be used as the key of a property whose value was the key of the actual value property. So in my above example message the client would see something like:

const message = {
  example: 'b',
  b: 1,
};

This would allow the consumer to just get the value with: message[message.example].

But protobuf-ts specially does not put the oneof key/value inside the parent object like that. It makes a separate object so that the generated union is confined to just the oneof and not the entire message:

const message = {
  example: {
    oneofKind: 'b',
    b: 1,
  }
}

But because of TypeScript you don't actually get any benefit from being able to do something like this to get the value: message.example[message.example.oneofKind]. So I guess my question is: why bother making the value of the oneof keyed off the identifier? Oneof could much more simply be represented with an object with just two static properties: kind and value:

const message = {
  example: {
    kind: 'b',
    value: 1,
  }
}

This would have the exact same benefits my previous comment showcased, but is arguably more ergonomic since it avoids using arrays for something they probably shouldn't be used for since order doesn't actually matter.

type UnknownOneofGroup = {
    kind: string,
    value: UnknownScalar | UnknownEnum | UnknownMessage,
} | { kind?: undefined, value?: undefined };

type ExampleOneof =
  | { kind: 'a', value: string }
  | { kind: 'b', value: number }
  | { kind: 'c', value: boolean }
  | { kind?: undefined, value?: undefined };

interface ExampleMessage {
  example: ExampleOneof;
}

declare var message: ExampleMessage;

Get
if (!message.example.kind) {
  throw new Error("No kind specified");
}
const {
  kind, // "a" | "b" | "c"
  value // string | number | boolean
} = message.example;

// Set
message.example = { kind: 'b', value: 1 };

// Clear
message.example = {};

jcready avatar Aug 16 '21 21:08 jcready

Now that looks interesting!

If I remember correctly I was hesitant to omit the field names - in all other language implementations I am aware of, oneof members are siblings to regular fields and doing away with the field name probably felt a bit too drastic. After having used the generated types for some time however, your proposal does look very attractive.

It's not completely trivial to use kind instead of oneofKind, because the PartialMessage type uses the discriminator to identify oneof groups, so that users cannot provide partial oneof groups. We are escaping field name oneofKind - which is very unlikely to occur anyway - kind however is much more likely to exist as a field name. That said, { kind: 'b', value: 1 } is very succinct and may be worth that tradeoff.

I'm not opposed to an early 3.0 :) we should explore this.

timostamm avatar Aug 16 '21 22:08 timostamm

I'm not at all attached to kind. Keeping it oneofKind works just as well. The important thing is that the the other key in the object is static, the one that holds the value. I'm also not attached to value being the key for that either :)

type ExampleOneof =
  | { oneofKind: 'a', value: string }
  | { oneofKind: 'b', value: number }
  | { oneofKind: 'c', value: boolean }
  | { oneofKind?: undefined, value?: undefined };

jcready avatar Aug 16 '21 23:08 jcready

But I'm already attached :)

The oneofKind discriminator always felt a bit like a workaround to me. I'd wager many users ask themselves why the discriminator is named that way when they encounter it for the first time. { kind: 'b', value: 1 } does not appear to have that problem.

Regarding name clashes: In the test fixtures, there's http.proto and type.proto with a field named kind. It should be possible to look at all message fields, and only escape when just kind and value are present. That would bring the likelihood of clashes down to the same ballpark as oneofKind.

timostamm avatar Aug 17 '21 08:08 timostamm

Implementation notes. It's a long list, but many items will be trivial.

Runtime:

  1. Use intermediate type { oneofKind: ... , oneofValue: ... } in UnknownOneofGroup, isOneofGroup() and the accessor utility functions
  2. Take the generated output of msg-oneofs.proto as a candidate and change the ADT manually to the new representation
  3. Update reflectionCreate, reflectionMergePartial, reflectionEquals and ReflectionTypeCheck
  4. Update ReflectionJsonReader, ReflectionJsonWriter
  5. Update ReflectionBinaryReader, ReflectionBinaryWriter
  6. Validate that the API has no regressions with the manually changed ADT

Plugin:

  1. Update code generator for the oneof union type (method createOneofADTPropertySignature of MessageInterfaceGenerator)
  2. Update secondary code generation that accesses oneofs (WellKnownTypes, GoogleTypes, Create (?), InternalBinaryRead, InternalBinaryWrite)
  3. Validate that the API has no regressions with the new generated ADT

Polish:

  1. Improve name clash resolving, special case messages that only have the two ADT properties as field names
  2. Switch to { kind: ... , value: ... }
  3. Test and document

timostamm avatar Aug 17 '21 08:08 timostamm

I've been taking a shot at making these changes this weekend and I'm wondering why a field name matching the oneofKindDiscriminator needs to be escaped if it is not part of a oneof group? Is it just to ensure the runtime isOneofGroup works correctly? And if we need that to work correctly I don't love having to escape any field named kind as that is far more likely to be the name of a protobuf field vs. oneofKind.

jcready avatar Apr 03 '22 17:04 jcready

Ah, re-reading through the thread I see now this bit:

PartialMessage type uses the discriminator to identify oneof groups

That's a bit unfortunate. The EBNF for both proto2 and proto3 specify that a field name must conform to /^[a-z][a-z0-9_]+$/i so it would be trivial to pick a discriminator that didn't match that. As clean as kind is, it could be nearly as clear (and avoid fields getting "escaped" for collisions) if we went with something else. Some possible options:

  • { _: 'b', value: 1 }
  • { $: 'b', value: 1 }
  • { _kind: 'b', value 1 }
  • { $kind: 'b', value: 1 }

jcready avatar Apr 03 '22 18:04 jcready

One thing that is confusing to me is why oneofKindDiscriminator is even an option that can be set for the interpreter. Since that value is never communicated to the runtime it means that if was set to any other value when invoking the interpreter then none of the runtime code would actually work correctly with the generated code.

Edit: Technically both the "kind" and "value" property names of a oneof group could be defined and exported once inside the runtime and everywhere else (including the generated code) could simply index into oneof groups using those exported strings.

// runtime/src/unknown-types.ts
import {oneofKind, oneofValue} from "./oneof";

// ...later

/**
 * A unknown oneof group. See `isOneofGroup()` for details.
 */
 export type UnknownOneofGroup = {
    [oneofKind]: undefined | string;
    [oneofValue]?:
        | UnknownScalar
        | UnknownEnum
        | UnknownMessage;
};
// runtime/src/oneof.ts
export const oneofKind = '$kind';
export const oneofValue = 'value';
export type OneofKind = typeof oneofKind;
export type OneofValue = typeof oneofValue;

// Assert our values above match the UnknownOneofGroup type, then garbage collect
({
    [oneofKind]: oneofKind,
    [oneofValue]: oneofValue,
} as UnknownOneofGroup);

The plugin and plugin-framework packages already import the runtime. Then, instead of generating code that uses "dot" notation to work with oneof (e.x. foo.$kind) you just make it use "square bracket" notation to index into the oneof (e.x. foo[rt.oneofKind]):

// message.result = {
//     [rt.oneofKind]: "err",
//     [rt.oneofValue]: reader.string()
// };
scalarOneof(field: rt.FieldInfo & { kind: "scalar" | "enum"; oneof: string; repeat: undefined | rt.RepeatType.NO }): ts.Statement[] {
    let oneofKind = this.imports.name(source, 'oneofKind', this.options.runtimeImportPath);
    let oneofValue = this.imports.name(source, 'oneofValue', this.options.runtimeImportPath);
    let type = field.kind == "enum" ? rt.ScalarType.INT32 : field.T;
    let longType = field.kind == "enum" ? undefined : field.L;
    return [ts.createExpressionStatement(ts.createBinary(
        ts.createPropertyAccess(ts.createIdentifier("message"), field.oneof),
        ts.createToken(ts.SyntaxKind.EqualsToken),
        ts.createObjectLiteral(
            [
                //     [rt.oneofKind]: "err",
                ts.createPropertyAssignment(
                    ts.createComputedPropertyName(ts.createIdentifier(oneofKind)),
                    ts.createStringLiteral(field.localName))
                ),
                //     [rt.oneofValue]: reader.string()
                ts.createPropertyAssignment(
                    ts.createComputedPropertyName(ts.createIdentifier(oneofValue)),
                    this.makeReaderCall("reader", type, longType)
                )
            ],
            true
        )
    ))];
}

jcready avatar Apr 09 '22 15:04 jcready

One thing that is confusing to me is why oneofKindDiscriminator is even an option that can be set for the interpreter.

It also has some other options that can't be changed without changes to other packages - it's just good style to not repeat string literals all over, and instead of global const, I chose an explicit constructor argument :shrug:

foo[rt.oneofKind]

Possible, but maybe not the most convenient regarding IDE autocompletion.

timostamm avatar Apr 23 '22 12:04 timostamm

That's a bit unfortunate.

For sure.

From a comment above:

It should be possible to look at all message fields, and only escape when just kind and value are present.

But that might be a bit much magic. Going to think about it about alternatives.

timostamm avatar Apr 23 '22 13:04 timostamm

It also has some other options that can't be changed without changes to other packages - it's just good style to not repeat string literals all over, and instead of global const, I chose an explicit constructor argument 🤷

But like I mentioned above, if you constructed the interpreter with any other value then none of the TS types would match the actual output (like UnknownOneof)

foo[rt.oneofKind]

Possible, but maybe not the most convenient regarding IDE autocompletion.

This would only be for internal use (although, external consumers could choose to use those constants if they wanted to). But generally for an external consumer they would still just be able to access the oneof group fields with foo.$kind and foo.value.

Additionally it means that the plugin version and plugin-framework version could remain unchanged if we ever needed to update those property names again. Bonus: if a consumer wanted to choose their own oneof group field names they'd only need to patch the runtime. The plugin/plugin-framework would just work.

It should be possible to look at all message fields, and only escape when just kind and value are present.

But even still, "kind" and "value" are going to be much more likely to show up in actual proto message field names than "oneof_kind". It also would require more logic to find and escape when only those two fields are present (possibly significant changes to the existing method of field name escaping). Comparatively, just renaming oneofKind to something that was impossible to show up in a proto message field name is much easier (e.g. $kind).

jcready avatar Apr 23 '22 15:04 jcready

I spent some time trying to see if my idea of using element access using a constant string defined in the runtime, but apparently it won't work because TS is unable to properly discriminate when using a non-literal (but constant) string to index into an object :\

Playground

const oneofKind = "$kind" as const;
const oneofValue = "value" as const;

type ExampleOneof =
    | { [oneofKind]: "a"; [oneofValue]: string; }
    | { [oneofKind]: "b"; [oneofValue]: number; }
    | { [oneofKind]: "c"; [oneofValue]: boolean; }
    | { [oneofKind]: undefined; };

declare const exampleOneof: ExampleOneof;

if (exampleOneof.$kind === "b") {
    const b = exampleOneof[oneofValue]; // OK
}
if (exampleOneof['$kind'] === "b") {
    const b = exampleOneof[oneofValue]; // OK
}
if (exampleOneof[oneofKind] === "b") {
    const b = exampleOneof[oneofValue]; // Error: Element implicitly has an 'any' type because expression of type '"value"' can't be used to index type 'ExampleOneof'.
}

jcready avatar Apr 24 '22 15:04 jcready

This is kind of random, but I found a way to generically convert between the current oneof shape and the new one (see it in action: Playground):

type OldExampleOneof =
  | { oneofKind: 'a', a: string }
  | { oneofKind: 'b', b: number }
  | { oneofKind: 'c', c: boolean }
  | { oneofKind: undefined };

type NewExampleOneof =
  | { kind: 'a', value: string }
  | { kind: 'b', value: number }
  | { kind: 'c', value: boolean }
  | { kind: undefined };

type OneofHelper<
    T,
    K extends T extends { oneofKind: keyof T } ? T['oneofKind'] : never
> = T extends { oneofKind: K } ? T[K] : never;
export function helpfulOneof<
    T extends { oneofKind: string | undefined; [k: string]: unknown },
    K extends T extends { oneofKind: keyof T } ? T['oneofKind'] : never,
    M extends { [k in K]: { kind: k; value: OneofHelper<T, k> } }
>(oneof: T): M[keyof M] | { kind: undefined } {
    const kind = oneof.oneofKind;
    if (!kind) {
        return { kind: undefined };
    }
    const value = oneof[kind];
    // @ts-ignore
    return { kind, value };
}

type FromOneofHelper<
    T,
    K extends T extends { kind: string } ? T['kind'] : never
> = T extends { kind: K; value: unknown } ? T['value'] : never;
export function fromHelpfulOneof<
    T extends { kind: string | undefined; value?: unknown },
    K extends T extends { kind: string } ? T['kind'] : never,
    M extends { [k in K]: { oneofKind: k } & { [p in k]: FromOneofHelper<T, k> } }
>(oneof: T): M[keyof M] | { oneofKind: undefined } {
    const oneofKind = oneof.kind;
    if (!oneofKind) {
        return { oneofKind: undefined };
    }
    // @ts-ignore
    return { oneofKind, [oneofKind]: oneof.value };
}

function onlyOldOneof(example: OldExampleOneof) {}
function onlyNewOneof(example: NewExampleOneof) {}

declare const oldExample: OldExampleOneof;
declare const newExample: NewExampleOneof;

// old to new
const oldToNew = helpfulOneof(oldExample);
onlyNewOneof(oldToNew);

// new to old
const newToOld = fromHelpfulOneof(newExample);
onlyOldOneof(newToOld);

// round-trip
onlyOldOneof(fromHelpfulOneof(helpfulOneof(oldExample)));
onlyNewOneof(helpfulOneof(fromHelpfulOneof(newExample)));

jcready avatar Apr 28 '22 15:04 jcready

I'm currently taking a stab at using the library, and so far it's been quite nice. Especially the fact that it ships the protoc compiler made setting it up a real breeze.

This oneof stuff is currently the most annoying bit, especially the not-typical structure makes using it (while remaining type safe) really hard. Heck, the built-in, and sadly not all that easy to find getOneofValue chokes on all the types I have. messages.txt

const playerPickup: ServerPlayerPickup = null as any;
const result1 = getOneofValue(playerPickup.kind, "heart");
console.log(result1);

const gameAction: GameAction = null as any;
const result2 = getOneofValue(gameAction.action, "cPlayerMove");
console.log(result2);

results in the Typescript compiler outputting messages such as

[{
	"resource": "/c:/Coding/GitHub/Games/noita-together/electron/main/proto/messages.ts",
	"owner": "typescript",
	"code": "2345",
	"severity": 8,
	"message": "Argument of type '{ oneofKind: \"heart\"; heart: ServerPlayerPickup_HeartPickup; } | { oneofKind: \"orb\"; orb: ServerPlayerPickup_OrbPickup; } | { ...; }' is not assignable to parameter of type 'UnknownOneofGroup'.\n  Type '{ oneofKind: \"heart\"; heart: ServerPlayerPickup_HeartPickup; }' is not assignable to type 'UnknownOneofGroup'.\n    Property 'heart' is incompatible with index signature.\n      Type 'ServerPlayerPickup_HeartPickup' is not assignable to type 'UnknownScalar | UnknownMessage | undefined'.\n        Type 'ServerPlayerPickup_HeartPickup' is not assignable to type 'UnknownMessage'.\n          Index signature for type 'string' is missing in type 'ServerPlayerPickup_HeartPickup'.",
	"source": "ts",
	"startLineNumber": 4843,
	"startColumn": 31,
	"endLineNumber": 4843,
	"endColumn": 48
}]

I personally would absolutely welcome changing the oneof representation to something more typical, like { kind: ..., value: ... } or one of the variations with a special character like $kind.

stefnotch avatar Jul 22 '22 19:07 stefnotch

I spent some time trying to see if my idea of using element access using a constant string defined in the runtime, but apparently it won't work because TS is unable to properly discriminate when using a non-literal (but constant) string to index into an object :\

Playground

const oneofKind = "$kind" as const;
const oneofValue = "value" as const;

type ExampleOneof =
    | { [oneofKind]: "a"; [oneofValue]: string; }
    | { [oneofKind]: "b"; [oneofValue]: number; }
    | { [oneofKind]: "c"; [oneofValue]: boolean; }
    | { [oneofKind]: undefined; };

declare const exampleOneof: ExampleOneof;

if (exampleOneof.$kind === "b") {
    const b = exampleOneof[oneofValue]; // OK
}
if (exampleOneof['$kind'] === "b") {
    const b = exampleOneof[oneofValue]; // OK
}
if (exampleOneof[oneofKind] === "b") {
    const b = exampleOneof[oneofValue]; // Error: Element implicitly has an 'any' type because expression of type '"value"' can't be used to index type 'ExampleOneof'.
}

Update from the future: Typescript has been updated and this doesn't throw an error anymore

stefnotch avatar Jul 22 '22 19:07 stefnotch

For what it's worth I made some minor improvements so that you can do the conversion to/from the "useful" oneof type which is especially useful if your linter requires that you annotate the return type of functions.

export function fromOneof<T extends AnyOneof>(oneof?: T): UsefulOneof<T> {
    const kind = oneof?.oneofKind;
    if (!kind) {
        // @ts-ignore
        return { kind: undefined };
    }
    const value = oneof[kind];
    // @ts-ignore
    return { kind, value };
}

export type UsefulOneof<
    T extends AnyOneof,
    K extends AnyOneofKind<T> = AnyOneofKind<T>,
    M extends AnyOneofValueMap<T, K> = AnyOneofValueMap<T, K>
> = T extends { oneofKind: keyof M } ? M[keyof M] : { kind: undefined; value?: never };

// Helper types for UsefulOneof
export type AnyOneof = { oneofKind: string; [k: string]: unknown } | { oneofKind: undefined };
export type AnyOneofKind<T extends AnyOneof> = T extends { oneofKind: string & keyof T }
    ? T['oneofKind']
    : never;
export type AnyOneofValueMap<T extends AnyOneof, K extends AnyOneofKind<T> = AnyOneofKind<T>> = {
    [k in K]: { kind: k; value: T extends { oneofKind: k } ? T[k] : never };
};

export function toOneof<T extends AnyUsefulOneof>(oneof?: T): Oneof<T> {
    const oneofKind = oneof?.kind;
    if (!oneofKind) {
        // @ts-ignore
        return { oneofKind: undefined };
    }
    // @ts-ignore
    return { oneofKind, [oneofKind]: oneof.value };
}

export type Oneof<
    T extends AnyUsefulOneof,
    K extends AnyUsefulOneofKind<T> = AnyUsefulOneofKind<T>,
    M extends AnyUsefulOneofValueMap<T, K> = AnyUsefulOneofValueMap<T, K>
> = T extends { kind: keyof M } ? M[keyof M] : { oneofKind: undefined };

// Helper types for Oneof
export type AnyUsefulOneof = { kind: string; value: unknown } | { kind: undefined; value?: never };
export type AnyUsefulOneofKind<T extends AnyUsefulOneof> = T extends { kind: string }
    ? T['kind']
    : never;
export type AnyUsefulOneofValueMap<
    T extends AnyUsefulOneof,
    K extends AnyUsefulOneofKind<T> = AnyUsefulOneofKind<T>
> = {
    [k in K]: { oneofKind: k } & {
        [p in k]: T extends { kind: string & k; value: unknown } ? T['value'] : never;
    };
};

So for example, you can now easily export the type of a concrete protobuf-ts oneof type:

import { UsefulOneof, fromOneof } from './utils';
import { MyMessageType } from 'generated/my-message-type';

export type MyUsefulSomeOneof = UsefulOneof<MyMessageType['some_oneof']>;
export function getMyMessageUsefulSomeOneof(message: MyMessageType): MyUsefulSomeOneof {
    return fromOneof(message.some_oneof);
}

jcready avatar Jul 23 '22 17:07 jcready

This was implemented in https://github.com/timostamm/protobuf-ts/pull/262

timostamm avatar Nov 17 '22 15:11 timostamm