hwtypes
hwtypes copied to clipboard
ADT naming
After seeing the request for AnonomousProduct, this got me thinking about #71 and I wanted to give a systematic description of what types we currently have to track naming inconsistencies.
My claim is that we could create up to 16 types which represent the product of the four following binary decisions: (Sum or Product, has_class_name, has_field_names, is_ordered)
Here is the 2x2x2x2 grid where currently defined types are filled in appropriately.
Sum | field_names & is_ordered | field_names & ~is_ordered | ~field_names & is_ordered | ~field_names & ~is_ordered |
---|---|---|---|---|
has_name | TaggedUnion | |||
~has_name | Sum |
Product | field_names & is_ordered | field_names & ~is_ordered | ~field_names & is_ordered | ~field_names & ~is_ordered |
---|---|---|---|---|
has_name | Product | |||
~has_name | AnonomousProduct | Tuple |
I don't think all of these classifications are fully correct. There is not quite the equivalence between sum and product as you are describing it:
Its more like this:
Sum
types are anonymous and indexed by type
. TaggedUnion
types are also indexed by type but are named and have explicit tag names.
Tuple
types are anonymous and indexed by int
. AnonymousProduct
types are also indexed by int and are anonymous but has explicit field names. Product
types are the named variant of AnonymousProduct
.
So to me the matrix would be this:
Anonymous Class & Anonymous Fields | Anonymous Class & Named Fields | Named Class & Anonymous Fields | Named Class & Named Fields | |
---|---|---|---|---|
Int Indexed | Tuple | Anonymous Product | Product | |
Type Indexed | Sum | TaggedUnion |
I don't think Named Class & Anonymous Fields really fits into any of our syntaxes so I am fine with ignoring it. It also true we could build a type indexed tuple and int indexed sum but I just don't really see them ever being useful so we can basically ignore that. (You might argue that a unordered product might have some utility but that encoding generated by AssembledADT
does not have to encoded them in order. Tuple[Bit, Bit, Bit](...)[1]
does not necessarily have to extract [1:2]
of the underlying BitVector encoding. It can place the fields anyway it likes.)
Now I wouldn't mind changing the names to something like this:
Anonymous Class & Anonymous Fields | Anonymous Class & Named Fields | Named Class & Anonymous Fields | Named Class & Named Fields | |
---|---|---|---|---|
Int Indexed | Tuple | AnonymousProduct | Product | |
Type Indexed | Union | AnonymousSum | Sum |
There is one important distinctions that I should point out is that Named Class
types are by default not cached. While Anonymous Class
types are. This is because I want to allow named variants to add methods and getting useful and correct caching with methods is basically not possible without AST introspection.
class SpecialProduct(Product):
x = int
y = int
def delta(self):
return self.x - self.y
This does not currently work with Assembled ADT but in theory as long as the methods were pure functions of attributes accessible to AssembledADT then it could.
I suppose that the other use of an unordered product would be to get the following relationship:
UProduct[x=int, y=str] is UProduct[y=str, x=int]
. So there might be utility to that. I still see no real use case for an OrderedSum
though. When would we want OSum[int, str] is not OSum[str, int]
?
Just to be clear, I am not actually advocating completely filling out the table. Some of the empty spaces might not be very useful (Ordered Sum, Named classes + anonymous fields, etc...). The purpose of this was to enumerate the full space of types in order to categorize what we have and possibly consider more consistent names.
Also, I do think there is a meaningful semantic difference between Product and Sum. I probably should have labeled that dimension as (Product/Coproduct) to be more clear and consistent with the category theory. In our implementation perhaps this dimension coincides exactly with whether fields are int-indexed or type-indexed, but I think the dimension is better labeled as the Product/Coproduct.
While Sum
types are coproducts they are not coproducts in the most traditional sense. Usually coproducts are int
indexed and Coproduct[A, B] is not Coproduct[B, A]
now we can define the index set as {A, B}
which takes care of Sum
. However, for TaggedUnion
case is a little bit more weird as its simultaneously two coproduct at once. It is both the coproduct defined like the Sum
and a coproduct indexed by field names. It is easy to project TaggedUnion
into Sum
but you can't build a bijection. So I don't know how to define it in category theory.