What

The idea here is that we add a variant to AT (AlgebraicType) that looks like this:

struct TransparentType {
    ty: Box<AlgebraicType>,
    name: Box<str>,
}

pub enum AlgebraicType {
    ...
    Transparent(TransparentType),
}

This is a newtype around the ty, with a name assigned to it.

There won't however be a TransparentValue or such, as the type is representationally transparent, meaning that an AV compatible with ty is also compatible with the Transparent(...) wrapper around ty.

Why

The representational transparency aforementioned is the whole point. Currently, our Identity and Address types are typed in the database as:

AlgebraicType::product([("__identity_bytes", AlgebraicType::bytes())]) // Identity
AlgebraicType::product([("__address_bytes", AlgebraicType::bytes())]) // Address

But this also means that the AV representation is AV::Product(vec![inner_av]) which incurs branching and an unnecessary heap allocation. This also extends to indices, which need to materialize AVs in the general case (although we could do a one-field-product optimization for indices). Currently, this also affects normal scans which also materialize AVs.

With ::Transparent, we can get rid of this wrapping layer and have the AV representation just be inner_av. Coupled with #1097, this makes it very cheap to store an Identity.

Alternatives

Keep things as is.
Optimize indices to unwrap a single-field product value and call it a day.
Introduce AV/T::{Identity, Address}. Arguably these are somewhat special types, and this would be even cheaper. However, these newtypes inroduced with ::Transparent could also be useful for users to make their own specially recognized types.

Bikeshed

I chose Transparent to mirrior #[repr(transparent)] in Rust.

We might also name this NewType or some such.

May 23 '24 11:05 Centril

Makes sense. It would be nice to have all single-field products optimized in this way, but I suspect it might make iteration code fairly hairy.

Another, compiler-y approach would be to have a compiler IR that supports isomorphisms (reversible encoders) iso: LogicalType -> OptimizedType, iso_op: OptimizedType -> LogicalType, iso ▷ iso_op = id, iso_op ▷ iso = id. You would write your code against the naive representation LogicalType, and automatically insert the conversions to/from OptimizedType at the boundaries of the code. Then you add compiler transforms to fuse the user operations on LogicalType with the encoding/decoding operations on OptimizedType. This would allow you to simplify the type system -- the user no longer has to distinguish between transparent and non-transparent tuples at the type level, that's merely an encoding detail. You do need a good compiler though, which is a large amount of work.

May 23 '24 19:05 kazimuth

You know I've actually thought about this problem a lot because a single field product type is actually functionally identical to a single field sum type. I think it makes sense to have a special case for it if it improves performance, although maybe the name should be singleton since it's a 1-tuple. Sum and product types are only distinguished by the operators that combine the fields after all:

() # Unit
(foo: Foo) # Singleton
(foo: Foo * bar: Bar) # Product (ordered pair)
(foo: Foo + bar: Bar) # Sum

May 24 '24 02:05 cloutiertyler

I would say however, that this is really not a priority

May 24 '24 02:05 cloutiertyler

although maybe the name should be singleton since it's a 1-tuple.

The notion of a "singleton type" is already used for a type inhabited by a single value in Scala, Haskell, and type theory:

https://stackoverflow.com/questions/33052086/what-is-a-singleton-type-exactly
https://stackoverflow.com/questions/16017294/singleton-types-in-haskell
https://ncatlab.org/nlab/show/singleton
https://www.cs.cmu.edu/~rwh/students/stone.pdf

The notion of Transparent I'm proposing here is more of a newtype with a "coercion" for values, but Transparent I feel says more about the semantics.

May 24 '24 13:05 Centril

Looks fine. Another naming candidate: alias

May 24 '24 15:05 mamcx

Does this impact BSATN at all?

Jun 13 '24 18:06 cloutiertyler

It does not affect the BSATN of AlgebraicValue at all, but it does add a non-breaking change to the BSATN of AlgebraicType itself.

Jun 13 '24 18:06 Centril

How do you codegen for a transparent type?

Jun 20 '24 14:06 gefjon

Add `AlgebraicType::Transparent { ty: Box<AlgebraicType>, name: Box<str> }`

What

Why

Alternatives

Bikeshed