horde-ad
horde-ad copied to clipboard
Represent transpose of Delta as rewrite rules
Here's the Tom's idea from https://github.com/Mikolaj/horde-ad/issues/95#issuecomment-1503013877:
the transposition can't be expressed as rewriting rules, because it's stateful.
It's only stateful because it's Cayley-transformed. The eval function:
eval :: R -> Delta -> DeltaMap -> DeltaMap
is really
eval :: R -> Delta -> Endo DeltaMap
: its codomain is Cayley-transformed (thinkDList
difference lists) in order to make things more efficient. (addDelta
becomes a bit more expensive this way, because it has to update a value in the map instead of being able to create a singleton map, but addition of maps is much cheaper because surely(.)
is more efficient thanMap.union
.)So maybe it can be expressed using rewrite rules, but just into the language of
Endo DeltaMap
, not ofDeltaMap
?
As far as I understand so far, we'd need more Delta
constructors and we'd transpose by rewriting the Delta
expressions. Afterwards, to get rid of the Delta
constructors, we'd interpret them straightforwardly as linear transformation syntax, which is implemented (and probably bit-rotted) in buildDerivative
, which computes forward derivatives on the basis of Delta
collected in forward pass. In the pipeline that produces gradient Ast, we'd run buildDerivative
with Ast
codomain.
I wonder how much the extended Delta
would start resembling Ast
. If we switched from Delta
to Ast
, we'd end up with Ast
nested inside Ast
, which would be fine in this case. But perhaps Delta
has some advantage over Ast
? I guess it's linear, just as the linear sublanguage in the YOLO paper. I wonder if it's easy to extend Delta
to be enough for this term rewriting, but still syntactically guaranteeing linearity (assuming it does ATM, after all the tensor extensions).
A modest version of this rewrite is to evaluate Delta
to combinators of type DeltaMap -> DeltaMap
, but not reify the combinators as constructors of Delta
. Then, this can be presented as a rewrite, but implemented with immediate evaluation of the rewritten terms. Relevant parts of buildDerivative
are then manually inlined.