dex-lang
dex-lang copied to clipboard
Optimize sum-type storage
Right now, when we lower sum types they (roughly) get expanded into a flattened product of all fields appearing in all cases, with only the storage corresponding to the active case actually getting initialized and used at run-time.
But this approach is highly wasteful! In particular an ADT such as
data ColorfulFloat =
RedFloat Float
BlueFloat Float
GreenFloat Float
should only use 5 (1+4) bytes of storage, not 13 (1+4+4+4)!
To improve this situation, we need to modify the Imp codegen to reuse storage already allocated for other cases while emitting the destinations for sum types.
With the recent rewrite of the Imp lowering to use type-determined representations (#1173), this is shouldn't be too hard.
The runtime representation of simplified types (base types, sums, products, dependent pairs) should be entirely handled in Imp.hs
in three functions: typeToTree
, valueToTree
, and atomToRepVal
. We probably violate the abstraction a little bit -- e.g. I notice that we use the Branch
constructor in the Case
case of translateExpr
in a sum-type-specific way too -- but those are the main ones.
The runtime representation can be an arbitrary function of the simplified type. Currently, we represent a sum type,
(t1 | t2 | ...)
as a product of a tag and a product of the components of the sum: (t1 & t2 & ...)
. But we could change that to e.g. a map from each base type to a power of that type according to the max number of occurrences across each sum type component.