Schema isomorphism

Open jdegoes opened this issue 2 years ago • 2 comments

Increasingly, it will be necessary for ZIO Schema to support schema-first approaches: for example, when a new Scala project must interoperate with existing gRPC services, whose messages are defined using protobuf.

In such a case, the Scala-first approach supported by ZIO Schema will not work very well, because it would require Scala developers to painstakingly create an ADT whose automatically-derived Schema will "happen" to correspond with the schema-first schema.

A much better approach is for code-generators (or ideally, macros) to programmatically create a ZIO Schema. This ZIO Schema would ideally act as a canonical definition of the protocol, but without necessarily defining or constraining what user-defined ADT a Scala developer could use to interact with that protocol.

Thanks to migrations, this approach is already possible: code-gen could generate a ZIO Schema, and then a user could "migrate" that Schema to another one, for a user-defined ADT. The drawback to this approach is that the migration cannot be known to be valid or invalid at compile time: one must actually try it at runtime to see if the migration is possible.
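To make the drawback concrete, here is a minimal, self-contained sketch of a migration that can only be validated at runtime. It is a toy model, not the ZIO Schema API: RecordSchema, migrate, and the three example schemas are hypothetical names introduced only for illustration.

```scala
// Toy model for illustration only; none of these names come from ZIO Schema.
object RuntimeMigrationSketch {
  // A record schema reduced to an ordered list of (field name, type name) pairs.
  final case class RecordSchema(fields: List[(String, String)])

  // Try to derive a converter from `from` to `to`. Whether a converter exists
  // is only discovered when this method runs.
  def migrate(
      from: RecordSchema,
      to: RecordSchema
  ): Either[String, Map[String, Any] => Map[String, Any]] = {
    val available = from.fields.toSet
    val missing   = to.fields.filterNot(available.contains)
    if (missing.isEmpty) {
      val wanted = to.fields.map(_._1).toSet
      // Keep only the fields the target schema asks for.
      Right((record: Map[String, Any]) => record.filter { case (key, _) => wanted.contains(key) })
    } else {
      Left("cannot migrate, missing fields: " + missing.map(_._1).mkString(", "))
    }
  }

  // Stand-in for a schema produced by code generation from a protobuf message.
  val generated: RecordSchema = RecordSchema(List("name" -> "String", "age" -> "Int"))

  // Stand-ins for the schemas of two user-defined ADTs.
  val compatible: RecordSchema   = RecordSchema(List("age" -> "Int", "name" -> "String"))
  val incompatible: RecordSchema = RecordSchema(List("name" -> "String", "email" -> "String"))
}
```

Both calls to migrate type-check identically, and the failure for the incompatible schema only shows up as a Left once the program is running. That is the gap a compile-time notion of isomorphism would close.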

One way of moving this validation to compile time may be to introduce a notion of schema isomorphism. Independently, this concept has arisen in ZIO SQL, which @sviezypan is working on. In particular, Jaro created a type class to witness that a given schema had all the right types in all the right orders to allow inserting case classes into a relational database table.

Bringing this concept of compile-time "schema isomorphism" into ZIO Schema may enable schema-first usage, without sacrificing the type safety and value-oriented features that ZIO Schema provides, while also enabling richer integration in libraries like ZIO SQL, which would no longer have to introduce their own more limited notions.
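The general shape of such a witness can be sketched as follows. This is not the actual ZIO SQL type class: Table, Insertable, and insert are hypothetical names, and the instance is written by hand here, whereas a real implementation would presumably derive it from the schema.

```scala
// Hypothetical sketch; these names are not taken from ZIO SQL.
object InsertWitnessSketch {
  // A table whose column types are tracked, in order, as a tuple type.
  final case class Table[Columns](name: String)

  // Evidence that a value of A lines up with Columns: same types, same order.
  trait Insertable[Columns, A] {
    def toRow(value: A): Columns
  }

  final case class Person(name: String, age: Int)
  object Person {
    // Hand-written evidence, placed in the companion so it is found implicitly.
    implicit val insertable: Insertable[(String, Int), Person] =
      new Insertable[(String, Int), Person] {
        def toRow(value: Person): (String, Int) = (value.name, value.age)
      }
  }

  // A type with no evidence for the (String, Int) table.
  final case class Point(x: Int, y: Int)

  val persons: Table[(String, Int)] = Table[(String, Int)]("persons")

  // Only compiles when evidence exists; returns the row that would be written.
  def insert[Columns, A](table: Table[Columns], value: A)(
      implicit ev: Insertable[Columns, A]
  ): Columns = ev.toRow(value)
}
```

With these definitions, insert(persons, Person("Ada", 36)) compiles, while insert(persons, Point(1, 2)) is rejected by the compiler because no Insertable[(String, Int), Point] is in scope.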

How might such a change be accomplished?

One idea is adding another type parameter to Schema, such that Schema[A] would become Schema[Abstract, Concrete]. The new type parameter would represent the structure of the data type, independent of ordering or specific materialization.

For example (assuming Scala 2.13 and higher, with singleton types):

type ->*[A, B]
type ->+[A, B]
// type &[A, B] = A with B

final case class Person(name: String, age: Int)

implicit val schema: Schema[("name" ->* String) & ("age" ->* Int), Person] = ...

With such a type parameter, which is not modified by Transform nodes, it becomes possible to do type-level comparisons and transformations on the structure of a schema. Moreover, it becomes possible in libraries like ZIO SQL to say, "I can work with any schema, so long as it has N fields of types X, Y, Z, ...".
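A minimal sketch of what such type-level comparisons could look like, assuming Scala 2.13 and a stand-in two-parameter Schema defined locally (the real Schema has a single type parameter). Employee, NameAndAge, describe, and sameStructure are hypothetical names, and mutual subtyping of the structure types is used here as one possible encoding of "same structure", not as a settled design.

```scala
// Hypothetical sketch; this Schema is a local stand-in, not ZIO Schema's.
object StructuralSchemaSketch {
  // Phantom type meaning "a field called Name of type Type"; never instantiated.
  sealed trait ->*[Name, Type]

  // Abstract describes the structure, Concrete is the user-facing data type.
  final case class Schema[Abstract, Concrete](fieldNames: List[String])

  final case class Person(name: String, age: Int)
  final case class Employee(age: Int, name: String) // same fields, other order

  val personSchema: Schema[("name" ->* String) with ("age" ->* Int), Person] =
    Schema(List("name", "age"))
  val employeeSchema: Schema[("age" ->* Int) with ("name" ->* String), Employee] =
    Schema(List("age", "name"))

  type NameAndAge = ("name" ->* String) with ("age" ->* Int)

  // Accepts any schema whose structure includes both fields, in any order.
  def describe[S, A](schema: Schema[S, A])(implicit ev: S <:< NameAndAge): String =
    schema.fieldNames.sorted.mkString(", ")

  // Compiles only when each structure is a subtype of the other.
  def sameStructure[S1, A, S2, B](left: Schema[S1, A], right: Schema[S2, B])(
      implicit to: S1 <:< S2,
      from: S2 <:< S1
  ): Boolean = true
}
```

Because compound types conform to each other regardless of member order, both schemas satisfy describe, and sameStructure(personSchema, employeeSchema) compiles. Note that describe on its own would also accept a schema with extra fields, so a one-directional check expresses "has at least these fields" rather than isomorphism.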

This is an early draft, and more work is required to make sure this direction is feasible. Other directions should also be considered, so long as they satisfy these design goals.

jdegoes, Dec 14 '21 18:12