horde-ad icon indicating copy to clipboard operation
horde-ad copied to clipboard

Represent transpose of Delta as rewrite rules

Open Mikolaj opened this issue 1 year ago • 3 comments

Here's the Tom's idea from https://github.com/Mikolaj/horde-ad/issues/95#issuecomment-1503013877:

the transposition can't be expressed as rewriting rules, because it's stateful.

It's only stateful because it's Cayley-transformed. The eval function:

eval :: R -> Delta -> DeltaMap -> DeltaMap

is really eval :: R -> Delta -> Endo DeltaMap: its codomain is Cayley-transformed (think DList difference lists) in order to make things more efficient. (addDelta becomes a bit more expensive this way, because it has to update a value in the map instead of being able to create a singleton map, but addition of maps is much cheaper because surely (.) is more efficient than Map.union.)

So maybe it can be expressed using rewrite rules, but just into the language of Endo DeltaMap, not of DeltaMap?

As far as I understand so far, we'd need more Delta constructors and we'd transpose by rewriting the Delta expressions. Afterwards, to get rid of the Delta constructors, we'd interpret them straightforwardly as linear transformation syntax, which is implemented (and probably bit-rotted) in buildDerivative, which computes forward derivatives on the basis of Delta collected in forward pass. In the pipeline that produces gradient Ast, we'd run buildDerivative with Ast codomain.

I wonder how much the extended Delta would start resembling Ast. If we switched from Delta to Ast, we'd end up with Ast nested inside Ast, which would be fine in this case. But perhaps Delta has some advantage over Ast? I guess it's linear, just as the linear sublanguage in the YOLO paper. I wonder if it's easy to extend Delta to be enough for this term rewriting, but still syntactically guaranteeing linearity (assuming it does ATM, after all the tensor extensions).

A modest version of this rewrite is to evaluate Delta to combinators of type DeltaMap -> DeltaMap, but not reify the combinators as constructors of Delta. Then, this can be presented as a rewrite, but implemented with immediate evaluation of the rewritten terms. Relevant parts of buildDerivative are then manually inlined.

Mikolaj avatar Apr 15 '23 08:04 Mikolaj