dex-lang Optimize sum-type storage

Optimize sum-type storage

Open apaszke opened this issue 4 years ago • 1 comments

Right now, when we lower sum types they (roughly) get expanded into a flattened product of all fields appearing in all cases, with only the storage corresponding to the active case actually getting initialized and used at run-time.

But this approach is highly wasteful! In particular an ADT such as

data ColorfulFloat = 
  RedFloat Float
  BlueFloat Float
  GreenFloat Float

should only use 5 (1+4) bytes of storage, not 13 (1+4+4+4)!

To improve this situation, we need to modify the Imp codegen to reuse storage already allocated for other cases while emitting the destinations for sum types.

Jan 04 '21 18:01 apaszke

With the recent rewrite of the Imp lowering to use type-determined representations (#1173), this is shouldn't be too hard.

The runtime representation of simplified types (base types, sums, products, dependent pairs) should be entirely handled in Imp.hs in three functions: typeToTree, valueToTree, and atomToRepVal. We probably violate the abstraction a little bit -- e.g. I notice that we use the Branch constructor in the Case case of translateExpr in a sum-type-specific way too -- but those are the main ones.

The runtime representation can be an arbitrary function of the simplified type. Currently, we represent a sum type, (t1 | t2 | ...) as a product of a tag and a product of the components of the sum: (t1 & t2 & ...). But we could change that to e.g. a map from each base type to a power of that type according to the max number of occurrences across each sum type component.

Feb 15 '23 19:02 dougalm

dex-lang dex-lang copied to clipboard

Optimize sum-type storage

dex-lang
dex-lang copied to clipboard