Upgradeable ArrayBuffers

Open sffc opened this issue 2 years ago • 5 comments

Possibly related: https://github.com/tc39/proposal-record-tuple/issues/134, https://github.com/tc39/proposal-record-tuple/issues/218

A potential direction this proposal could take, which would solve a new set of problems, could be "upgradeable ArrayBuffers":

  • The new language primitive is a byte array (ArrayBuffer by value).
  • The primitive can be "upgraded" by applying a schema to access data.

Something sort-of like this...

let input = { a: 1, b: 2 };

// Convert the object to its immutable value type
let primitive = Record.create(input);

// Convert the immutable value type back to an object using a pattern schema
let output = primitive.upgrade({ a: Number, b: Number });

// Access a field without upgrading (for efficiency)
primitive.get({ a: Number, b: Number }, ".a");

What this achieves:

  1. Immutability
  2. Value Types
  3. Equality is trivial (a byte comparison)
  4. Hopefully easier to implement
  5. Step toward zero-copy deserialization
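To make the idea concrete, here is a minimal sketch of a schema-driven byte layout, assuming every field is a number stored as a little-endian Float64 in the schema's key order. The names `createRecord`, `getField`, `upgrade`, and `bytesEqual` are hypothetical and stand in for the `Record.create`, `get`, and `upgrade` operations above; they are not part of any proposal.

```javascript
// Assumption: every field is a Number, stored as 8 little-endian bytes,
// in the order the schema lists its keys.
const FIELD_SIZE = 8;

// Serialize `obj` into a fresh ArrayBuffer using the key order of `schema`.
function createRecord(obj, schema) {
  const keys = Object.keys(schema);
  const view = new DataView(new ArrayBuffer(keys.length * FIELD_SIZE));
  keys.forEach((key, i) => view.setFloat64(i * FIELD_SIZE, obj[key], true));
  return view.buffer;
}

// Read one field directly from its offset, without rebuilding the object.
function getField(buffer, schema, key) {
  const index = Object.keys(schema).indexOf(key);
  if (index < 0) throw new RangeError(`unknown field: ${key}`);
  return new DataView(buffer).getFloat64(index * FIELD_SIZE, true);
}

// "Upgrade" the bytes back into a plain object.
function upgrade(buffer, schema) {
  const out = {};
  for (const key of Object.keys(schema)) out[key] = getField(buffer, schema, key);
  return out;
}

// Equality is a byte-by-byte comparison of the two buffers.
function bytesEqual(a, b) {
  if (a.byteLength !== b.byteLength) return false;
  const x = new Uint8Array(a);
  const y = new Uint8Array(b);
  return x.every((byte, i) => byte === y[i]);
}

const schema = { a: Number, b: Number };
const primitive = createRecord({ a: 1, b: 2 }, schema);
console.log(upgrade(primitive, schema)); // { a: 1, b: 2 }
console.log(getField(primitive, schema, "a")); // 1
console.log(bytesEqual(primitive, createRecord({ a: 1, b: 2 }, schema))); // true
```

One consequence of comparing bytes in this layout: `+0` and `-0` have different Float64 encodings, so they would compare as unequal even though `+0 === -0` is true for numbers.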

Some initial open questions:

  1. How to handle variable-length types in the record, like a String or a BigInt
    • Could allow one of these only in the final position (like Rust DSTs)
    • Or, variable-length fields could require a length prefix (but how is that represented?)
  2. Syntax for how to access fields
    • Call site needs the whole schema; how can this be expressed ergonomically?
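The length-prefix option from the first question could look roughly like the sketch below. It assumes a hypothetical layout where each string field is a 4-byte little-endian length followed by its UTF-8 bytes; `encodeStrings` and `readString` are illustrative names only.

```javascript
// Assumed layout per field: [u32 little-endian byte length][UTF-8 bytes].
const encoder = new TextEncoder();
const decoder = new TextDecoder();

// Pack a list of strings into one ArrayBuffer.
function encodeStrings(strings) {
  const chunks = strings.map((s) => encoder.encode(s));
  const total = chunks.reduce((n, c) => n + 4 + c.length, 0);
  const bytes = new Uint8Array(total);
  const view = new DataView(bytes.buffer);
  let offset = 0;
  for (const chunk of chunks) {
    view.setUint32(offset, chunk.length, true);
    bytes.set(chunk, offset + 4);
    offset += 4 + chunk.length;
  }
  return bytes.buffer;
}

// Reading field `index` has to hop over the prefix of every earlier field.
function readString(buffer, index) {
  const view = new DataView(buffer);
  let offset = 0;
  for (let i = 0; i < index; i++) offset += 4 + view.getUint32(offset, true);
  const length = view.getUint32(offset, true);
  return decoder.decode(new Uint8Array(buffer, offset + 4, length));
}

const buf = encodeStrings(["hello", "wörld"]);
console.log(readString(buf, 1)); // "wörld"
```

The trade-off shows up in `readString`: once any field has a variable length, the offsets of later fields are no longer known from the schema alone, so access becomes a scan rather than a constant-time lookup.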

sffc avatar Nov 30 '22 11:11 sffc

How would this handle unique symbols?

Is there a fundamental difference with JSON serialization and string comparison? What would happen if you use a different schema between serialization and deserialization?

mhofman avatar Nov 30 '22 12:11 mhofman

> How would this handle unique symbols?

Good question; don't have an immediate answer to that. It could be that symbols are not permitted as values, but that would limit functionality (such as storing WeakMap keys).

> Is there a fundamental difference with JSON serialization and string comparison?

The ArrayBuffer representation wouldn't be JSON; it would be a sequence of values.

This raises a question of whether a more ergonomic solution would be to say that the primitive can contain its context such that it can be upgraded without a schema. It would be much more flexible this way, but you'd lose the ability to perform random access.

> What would happen if you use a different schema between serialization and deserialization?

The same kind of thing that happens if you create a protobuf with one schema and deserialize it with a different one.
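As a rough illustration for an untagged fixed layout (an assumption here; protobuf itself tags its fields, so its failure modes differ), reading bytes with a mismatched schema does not fail, it just yields different values:

```javascript
// Write with one assumed layout: two little-endian Float64 fields, a = 1 and b = 2.
const view = new DataView(new ArrayBuffer(16));
view.setFloat64(0, 1, true);
view.setFloat64(8, 2, true);

// Read the same 16 bytes with a mismatched layout: four Uint32 fields.
const misread = [0, 4, 8, 12].map((offset) => view.getUint32(offset, true));
console.log(misread); // [ 0, 1072693248, 0, 1073741824 ]
```

Nothing in the buffer records which schema produced it, so the mismatch goes undetected unless the format carries extra type information.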

sffc avatar Nov 30 '22 12:11 sffc

Maybe a better solution in this vein is to say that the primitive is a CBOR buffer (RFC 8949), and then we can support fully ergonomic data access operations, with the catch that they may need to walk the CBOR buffer.

let input = { hello: "world", x: 100, y: true };

// Serialize to a primitive ArrayBuffer
let primitive = input.toCBOR();

// Deep equality
console.assert(primitive === input.toCBOR());

// Access a field (may require walking the CBOR :/)
console.log(primitive.hello); // "world"

There are dozens of attempts at binary object representation formats; maybe we could choose one with more efficient random access.
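To show what "walking the CBOR buffer" means in practice, here is a minimal sketch of a tiny CBOR subset: flat maps with fewer than 24 entries, text strings shorter than 24 bytes, unsigned integers below 256, and booleans. `toCBOR` and `getField` are hypothetical free functions standing in for the `input.toCBOR()` and `primitive.hello` syntax above. Keys are sorted with the default string sort so that equal objects produce equal bytes; this is a simplification and not the deterministic ordering defined in RFC 8949.

```javascript
const utf8 = new TextEncoder();
const fromUtf8 = new TextDecoder();

// Append the CBOR encoding of `value` to the byte list `out`.
function encodeItem(value, out) {
  if (value === true || value === false) {
    out.push(value ? 0xf5 : 0xf4); // simple values true / false
  } else if (Number.isInteger(value) && value >= 0 && value < 256) {
    if (value < 24) out.push(value); // major type 0, value in the initial byte
    else out.push(0x18, value); // major type 0, one-byte argument
  } else if (typeof value === "string") {
    const bytes = utf8.encode(value);
    if (bytes.length >= 24) throw new RangeError("string too long for this sketch");
    out.push(0x60 | bytes.length, ...bytes); // major type 3, short text string
  } else if (typeof value === "object" && value !== null) {
    const keys = Object.keys(value).sort(); // fixed order so equal objects give equal bytes
    if (keys.length >= 24) throw new RangeError("map too large for this sketch");
    out.push(0xa0 | keys.length); // major type 5, small map
    for (const key of keys) {
      encodeItem(key, out);
      encodeItem(value[key], out);
    }
  } else {
    throw new TypeError("unsupported value");
  }
  return out;
}

const toCBOR = (value) => Uint8Array.from(encodeItem(value, []));

// Decode one scalar item at `offset`; returns [value, offset after the item].
function decodeItem(bytes, offset) {
  const head = bytes[offset];
  if (head === 0xf4) return [false, offset + 1];
  if (head === 0xf5) return [true, offset + 1];
  if (head < 24) return [head, offset + 1];
  if (head === 0x18) return [bytes[offset + 1], offset + 2];
  if ((head & 0xe0) === 0x60 && (head & 0x1f) < 24) {
    const end = offset + 1 + (head & 0x1f);
    return [fromUtf8.decode(bytes.subarray(offset + 1, end)), end];
  }
  throw new TypeError("unsupported item");
}

// Field access walks the map entries in order until the key matches.
function getField(bytes, name) {
  if ((bytes[0] & 0xe0) !== 0xa0) throw new TypeError("not a map");
  let offset = 1;
  for (let i = 0; i < (bytes[0] & 0x1f); i++) {
    let key, value;
    [key, offset] = decodeItem(bytes, offset);
    [value, offset] = decodeItem(bytes, offset);
    if (key === name) return value;
  }
  return undefined;
}

const a = toCBOR({ hello: "world", x: 100, y: true });
const b = toCBOR({ y: true, x: 100, hello: "world" });
console.log(a.length === b.length && a.every((byte, i) => byte === b[i])); // true
console.log(getField(a, "hello")); // "world"
```

In this sketch, the cost of `getField` grows with the number of entries before the requested key, which is the random-access concern raised above.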

sffc avatar Nov 30 '22 12:11 sffc

Please support symbols in the initial proposal; they are very important in our custom serialization format.

lin72h avatar Nov 30 '22 13:11 lin72h

> Equality is trivial (a byte comparison)

I don't think reducing equality to byte comparison will ever be desirable. This will only work if string and bigint values are either copied into the structure (presumably as in this suggestion) or if they are interned on insertion.

I don't think copying strings is desirable. If an application has a very large string value and it decides to make a record from it, it shouldn't duplicate that memory use. Implementations of string concatenation even tend to avoid copying (instead producing "ropes" consisting of the source strings).

(As for the interning option, I think this is usually considered to be more of a performance burden than a gain, since it involves doing equality + hashing + global map manipulation on construction/free (lots of unconditional things) rather than simply doing equality on comparison).
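The interning trade-off can be sketched as follows, assuming a hypothetical global table that maps each distinct string content to a small integer id. A real engine would do this internally and would also have to remove entries when values are freed, which this sketch omits.

```javascript
// Hypothetical intern table: one small integer id per distinct string content.
const internTable = new Map();

// Interning pays for a hash and lookup (and possibly an insert) on every construction.
function intern(text) {
  let id = internTable.get(text);
  if (id === undefined) {
    id = internTable.size;
    internTable.set(text, id);
  }
  return id;
}

// With interned ids, comparing two fields is a single integer comparison.
const left = intern("a very large string value");
const right = intern("a very " + "large string value");
console.log(left === right); // true

// Without interning, construction does no extra work and the cost moves to comparison time.
console.log("a very large string value" === "a very " + "large string value"); // true
```

The work in `intern` happens unconditionally for every value that enters a record, whether or not that record is ever compared, which is the burden described above.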

Maxdamantus avatar Nov 30 '22 18:11 Maxdamantus